SymbolFit: Automatic Parametric Modeling with Symbolic Regression
Created by W.Langdon from
gp-bibliography.bib Revision:1.8886
@Article{Tsoi:2025:CS4BS,
author = "Ho Fung Tsoi and Dylan Rankin and Cecile Caillol and
Miles Cranmer and Sridhara Dasu and Javier Duarte and
Philip Harris and Elliot Lipeles and Vladimir Loncar",
title = "{SymbolFit}: Automatic Parametric Modeling with
Symbolic Regression",
journal = "Computing and Software for Big Science",
year = "2025",
volume = "9",
pages = "article number 12",
keywords = "genetic algorithms, genetic programming",
URL = "https://rdcu.be/eXDVR",
DOI = "10.1007/s41781-025-00140-9",
code_url = "https://github.com/hftsoi/symbolfit",
size = "45 pages",
abstract = "We introduce SymbolFit (API:
https://github.com/hftsoi/symbolfit), a framework that
automates parametric modeling by using symbolic
regression to perform a machine-search for functions
that fit the data while simultaneously providing
uncertainty estimates in a single run. Traditionally,
constructing a parametric model to accurately describe
binned data has been a manual and iterative process,
requiring an adequate functional form to be determined
before the fit can be performed. The main challenge
arises when the appropriate functional forms cannot be
derived from first principles, especially when there is
no underlying true closed-form function for the
distribution. We develop a framework that automates and
streamlines the process by using symbolic regression, a
machine learning technique that explores a vast space
of candidate functions without requiring a predefined
functional form because the functional form itself is
treated as a trainable parameter, making the process
far more efficient and effortless than traditional
regression methods. We demonstrate the framework in
high-energy physics experiments at the CERN Large
Hadron Collider (LHC) using five real proton-proton
collision datasets from new physics searches, including
background modeling in resonance searches for high-mass
dijet, trijet, paired-dijet, diphoton, and dimuon
events. We show that our framework can flexibly and
efficiently generate a wide range of candidate
functions that fit a nontrivial distribution well using
a simple fit configuration that varies only by random
seed, and that the same fit configuration, which
defines a vast function space, can also be applied to
distributions of different shapes, whereas achieving a
comparable result with traditional methods would have
required extensive manual effort.",
}