
Get with the Times: PyTabKit for better Tabular Machine Learning over Sk-Learn (CODE Included)

For too long, Scikit-Learn has been the go-to library for machine learning on tabular data, offering a broad collection of algorithms, preprocessing utilities, and model evaluation tools. Yes, it still works perfectly well, but why keep driving your grandfather’s run-down ’58 Chevy? Let it remain an antique. Enter PyTabKit: a new framework designed to replace Scikit-Learn for classification and regression on tabular data, leveraging cutting-edge techniques like RealMLP and improved default hyperparameters for GBDTs (gradient-boosted decision trees).

Full article link: arXiv:2407.04491
Citation: @inproceedings{holzmuller2024better,title={Better by default: {S}trong pre-tuned {MLPs} and boosted trees on tabular data}, author={Holzm{\"u}ller, David and Grinsztajn, Leo and Steinwart, Ingo}, booktitle = {Neural {Information} {Processing} {Systems}},year={2024}}
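
As a rough illustration of the swap described in the introduction, the sketch below fits a Scikit-Learn baseline and a PyTabKit model through the same `fit`/`predict` calls. It assumes `pytabkit` and `scikit-learn` are installed and that PyTabKit exposes a Scikit-Learn-style `RealMLP_TD_Classifier` (RealMLP with tuned defaults). The helper names `holdout_accuracy` and `compare_on_toy_data`, the dataset, and the split are illustrative choices, not taken from the article or the paper.

```python
def holdout_accuracy(model, X_train, y_train, X_test, y_test):
    """Fit any estimator exposing fit/predict and return its test accuracy."""
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    correct = sum(int(p == t) for p, t in zip(predictions, y_test))
    return correct / len(y_test)


def compare_on_toy_data():
    """Compare a Scikit-Learn baseline with PyTabKit's RealMLP.

    Third-party imports live inside the function so the helper above
    stays usable even when pytabkit is not installed.
    """
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import HistGradientBoostingClassifier
    from sklearn.model_selection import train_test_split
    from pytabkit import RealMLP_TD_Classifier  # assumed import path

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0, stratify=y
    )

    # Both models are used with their default settings; no tuning is done.
    models = {
        "sklearn HistGradientBoosting": HistGradientBoostingClassifier(),
        "pytabkit RealMLP (tuned defaults)": RealMLP_TD_Classifier(),
    }
    return {
        name: holdout_accuracy(model, X_train, y_train, X_test, y_test)
        for name, model in models.items()
    }
```

Calling `compare_on_toy_data()` returns a dictionary of accuracies keyed by model name. A single split on one small dataset says little about which model is better in general; the point is only that the estimator can be swapped without changing the surrounding code.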

Why Move Beyond Scikit-Learn?

Scikit-Learn provides a solid foundation for model development, but it lacks highly optimized deep learning methods and well-tuned default hyperparameters. Recent research has demonstrated that:

RealMLP Can Rival GBDTs

  • Deep learning models for tabular data have traditionally required extensive…

Published in Writing in the World of Artificial Intelligence

Written by Abish Pius

Data Science Professional, Python Enthusiast, turned LLM Engineer
