Where can you feed
artificial intelligence?
dr inż. Aleksander Smywiński-Pohl
Akademia Górniczo-Hutnicza, Techmo
apohllo.pl
https://goo.gl/Im3bk7
Machine learning algorithms are hungry
deeplearningbook.org:
“As of 2016, a rough rule of thumb is that a supervised
deep learning algorithm will generally achieve acceptable
performance with around 5,000 labeled examples per
category, and will match or exceed human performance
when trained with a dataset containing at least 10 million
labeled examples.”
https://goo.gl/Eu2vYq
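The rule of thumb quoted above can be turned into a quick check of how "well fed" a labeled dataset is. The sketch below is only an illustration: the two thresholds come from the quote, while the function name `check_dataset_size` and the shape of its result are assumptions made for this example, not something from the talk.

```python
from collections import Counter

# Thresholds taken from the rule of thumb quoted above (deeplearningbook.org).
ACCEPTABLE_PER_CATEGORY = 5_000
HUMAN_LEVEL_TOTAL = 10_000_000


def check_dataset_size(labels):
    """Compare a labeled dataset against the rule-of-thumb thresholds.

    `labels` is any iterable with one category label per example.
    The function name and return shape are illustrative assumptions.
    """
    counts = Counter(labels)
    # Categories with fewer examples than the "acceptable performance" threshold.
    underfed = {c: n for c, n in counts.items() if n < ACCEPTABLE_PER_CATEGORY}
    total = sum(counts.values())
    return {
        "total": total,
        "underfed_categories": underfed,
        "enough_for_acceptable": total > 0 and not underfed,
        "enough_for_human_level": total >= HUMAN_LEVEL_TOTAL,
    }


if __name__ == "__main__":
    # Hypothetical toy dataset: one category is above the threshold, one below.
    labels = ["cat"] * 6_000 + ["dog"] * 1_200
    print(check_dataset_size(labels))
```

A rule of thumb is not a guarantee, so a check like this is best read as a rough signal of which categories need more labeled examples.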
Pay people to feed your algorithms...
Mechanical Turk is a marketplace for work
https://goo.gl/n1girM
or...
...turn your users into free workers
https://goo.gl/PGFAvO
“The system has been
reported as displaying
over 100 million
CAPTCHAs every day”
Wikipedia
Buy ready-made datasets
https://www.ldc.upenn.edu/
LDC93S1 TIMIT Acoustic-Phonetic Continuous Speech Corpus
LDC2006T13 Web 1T 5-gram Version 1
LDC96L14 CELEX2
LDC2013T19 OntoNotes Release 5.0
LDC93S10 TIDIGITS
LDC99T42 Treebank-3
LDC2008T19 The New York Times Annotated Corpus
LDC93T3A TIPSTER Complete
LDC97S62 Switchboard-1 Release 2
LDC2016T19 BOLT Chinese-English Word Alignment and Tagging
or...
Use “free” datasets
https://goo.gl/3TkWSJ
Where to look?
http://lod-cloud.net/
https://goo.gl/KM1kv0

AIMeetup #2: Where can you feed artificial intelligence?
