SymbolFit: Automatic Parametric Modeling with Symbolic Regression
Created by W.Langdon from gp-bibliography.bib Revision:1.8721
@Article{Tsoi:2025:CS4BS,
author = "Ho Fung Tsoi and Dylan Rankin and Cecile Caillol and
Miles Cranmer and Sridhara Dasu and Javier Duarte and
Philip Harris and Elliot Lipeles and Vladimir Loncar",
title = "{SymbolFit}: Automatic Parametric Modeling with
Symbolic Regression",
journal = "Computing and Software for Big Science",
year = "2025",
volume = "9",
pages = "article number 12",
keywords = "genetic algorithms, genetic programming",
URL = "https://rdcu.be/eXDVR",
DOI = "10.1007/s41781-025-00140-9",
code_url = "https://github.com/hftsoi/symbolfit",
size = "45 pages",
abstract = "We introduce SymbolFit (API:
https://github.com/hftsoi/symbolfit), a framework that
automates parametric modeling by using symbolic
regression to perform a machine-search for functions
that fit the data while simultaneously providing
uncertainty estimates in a single run. Traditionally,
constructing a parametric model to accurately describe
binned data has been a manual and iterative process,
requiring an adequate functional form to be determined
before the fit can be performed. The main challenge
arises when the appropriate functional forms cannot be
derived from first principles, especially when there is
no underlying true closed-form function for the
distribution. In this work, we develop a framework that
automates and streamlines the process by using symbolic
regression, a machine learning technique that explores
a vast space of candidate functions without requiring a
predefined functional form because the functional form
itself is treated as a trainable parameter, making the
process far more efficient and effortless than
traditional regression methods. We demonstrate the
framework in high-energy physics experiments at the
CERN Large Hadron Collider (LHC) using five real
proton-proton collision datasets from new physics
searches, including background modeling in resonance
searches for high-mass dijet, trijet, paired-dijet,
diphoton, and dimuon events. We show that our framework
can flexibly and efficiently generate a wide range of
candidate functions that fit a nontrivial distribution
well using a simple fit configuration that varies only
by random seed, and that the same fit configuration,
which defines a vast function space, can also be
applied to distributions of different shapes, whereas
achieving a comparable result with traditional methods
would have required extensive manual effort.",
}