Classification Benchmarks

Model Version Details

Each benchmark on this page refers to a specific model version. The details of each version are listed here:

| model version | training algorithm | input sources | input size | dimensions | window size | release status |
|---|---|---|---|---|---|---|
| en0.14.0 | GloVe | Wiki | ~14G | 600 | 15 | released |
| en0.16.0 | GloVe | Wiki, CommonCrawl | ~1000G | 300 | 15 | released |
| en0.17.0 | fasttext | Wiki | ~14G | 300 | 15 | not released (yet) |
| en0.17.1 | fasttext | Wiki | ~14G | 300 | 5 | not released (yet) |
| en0.17.2 | fasttext | Wiki, CommonCrawl | ~1000G | 300 | 15 | training aborted! |
| en0.17.3 | fasttext | Wiki, CommonCrawl | ~100G | 300 | 15 | not released (yet) |

Benchmarks (KNN)

Enron Emails (Subset kaminski-v)

  • Source Repo: semi-technologies/enron-email-classification
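  • Current best: en0.14.0 at k=1

| contextionary | dimensions | k=1 | k=3 | k=5 | k=8 | k=13 | k=21 |
|---|---|---|---|---|---|---|---|
| en0.14.0-v0.4.9 | 600 | 74% | 72% | 71% | 70% | 67% | 63% |
| en0.16.0-v0.4.9 | 300 | 72% | 70% | 69% | 69% | 65% | 64% |
| en0.17.0-v0.4.15 | 300 | 68% | 68% | 67% | 64% | 63% | 60% |
| en0.17.1-v0.4.15 | 300 | 70% | 68% | 68% | 66% | 64% | 62% |
| en0.17.3-v0.4.15 | 300 | 72% | 70% | 70% | 69% | 66% | 64% |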

20 Newsgroups

  • Size: 60 per category
  • Source Repo: semi-technologies/20news-classification

Main Category (6 Categories)

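  • Current best: en0.17.3 at k=5

| contextionary | dimensions | k=1 | k=3 | k=5 | k=8 | k=13 | k=21 |
|---|---|---|---|---|---|---|---|
| en0.14.0-v0.4.15 | 600 | 76% | 73% | 72% | 74% | 74% | 70% |
| en0.16.0-v0.4.15 | 300 | 83% | 82% | 80% | 82% | 82% | 82% |
| en0.17.0-v0.4.15 | 300 | 78% | 80% | 77% | 77% | 73% | 72% |
| en0.17.1-v0.4.15 | 300 | 77% | 77% | 78% | 77% | 73% | 73% |
| en0.17.3-v0.4.15 | 300 | 83% | 84% | 85% | 82% | 81% | 80% |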

Fine Category (20 Categories)

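  • Current best: en0.17.3 at k=1

| contextionary | dimensions | k=1 | k=3 | k=5 | k=8 | k=13 | k=21 |
|---|---|---|---|---|---|---|---|
| en0.14.0-v0.4.15 | 600 | 57% | 53% | 53% | 50% | 48% | 46% |
| en0.16.0-v0.4.15 | 300 | 62% | 60% | 57% | 60% | 61% | 59% |
| en0.17.0-v0.4.15 | 300 | 57% | 57% | 56% | 56% | 56% | 51% |
| en0.17.1-v0.4.15 | 300 | 56% | 54% | 55% | 58% | 54% | 53% |
| en0.17.3-v0.4.15 | 300 | 66% | 64% | 64% | 61% | 62% | 61% |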
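
The kNN tables above report accuracy at several values of k. As a rough illustration of how such a number can be computed, the sketch below classifies each test vector by a majority vote among its k most similar training vectors and reports the fraction of correct predictions. It is not taken from the benchmark repositories: the function names, the use of cosine similarity, the tie-breaking rule, and the toy two-dimensional vectors are all assumptions made for this example. In the real benchmarks the vectors would come from the contextionary.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(query, train_vectors, train_labels, k):
    """Predict a label by majority vote among the k most similar training items."""
    # Rank training items by similarity to the query, most similar first.
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: cosine_similarity(query, pair[0]),
        reverse=True,
    )
    top_labels = [label for _, label in ranked[:k]]
    # On a tie, most_common keeps the label that was encountered first,
    # which here is the label of the nearer neighbour (an assumed rule).
    return Counter(top_labels).most_common(1)[0][0]


def knn_accuracy(train_vectors, train_labels, test_vectors, test_labels, k):
    """Fraction of test items whose predicted label matches the true label."""
    correct = sum(
        knn_predict(vector, train_vectors, train_labels, k) == label
        for vector, label in zip(test_vectors, test_labels)
    )
    return correct / len(test_labels)


# Hypothetical toy data standing in for document vectors and category labels.
TRAIN_VECTORS = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
TRAIN_LABELS = ["a", "a", "b", "b"]
TEST_VECTORS = [[0.8, 0.2], [0.2, 0.8]]
TEST_LABELS = ["a", "b"]

if __name__ == "__main__":
    for k in (1, 3):
        acc = knn_accuracy(TRAIN_VECTORS, TRAIN_LABELS, TEST_VECTORS, TEST_LABELS, k)
        print(f"k={k}: {acc:.0%}")
```

Running the same evaluation for each k in the tables (1, 3, 5, 8, 13, 21) would give one row of results per contextionary version.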

Benchmarks (contextual)

20 Newsgroups

  • Size: 60 per category
  • Source Repo: semi-technologies/20news-classification
  • Warning: Take these results (20-news contextual) with a grain of salt. They do not currently test the best possible hyper-parameters, only one specific configuration that worked well in the past. TODO: Improve the benchmark to test various hyper-parameters.

Main Category (6 Categories)

  • Current best: en0.14.0

| contextionary | dimensions | result |
|---|---|---|
| en0.14.0-v0.4.15 | 600 | 54% |
| en0.16.0-v0.4.15 | 300 | 50% |
| en0.17.0-v0.4.15 | 300 | 50% |
| en0.17.1-v0.4.15 | 300 | 50% |
| en0.17.3-v0.4.15 | 300 | 50% |

Fine Category (20 Categories)

  • Current best: en0.16.0

| contextionary | dimensions | result |
|---|---|---|
| en0.14.0-v0.4.15 | 600 | 44% |
| en0.16.0-v0.4.15 | 300 | 56% |
| en0.17.0-v0.4.15 | 300 | 44% |
| en0.17.1-v0.4.15 | 300 | 43% |
| en0.17.3-v0.4.15 | 300 | 50% |

More Resources

{% include docs-support-links.html %}