12. • similarity: tf-idf, bm25, lmd
• docs: by sentences, by paragraph, by articles
• minimal size: 50, 100, 150, 200, 300
• stemming: yes, no
• stop words: yes, no
• synonyms: yes, no
• corpus: ck12, science wiki, all wiki, …
Let’s try all we can
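Trying every combination of the options listed above amounts to a grid search over retrieval settings. A minimal sketch of how the configurations could be enumerated follows; the dictionary keys and the `iter_configs` helper are hypothetical names chosen for illustration, and only the values spelled out in the list are included (anything behind a "…" is left out).

```python
from itertools import product

# Option values copied from the list above; entries elided with "…" are not filled in.
# The key names are illustrative assumptions, not taken from the original setup.
GRID = {
    "similarity": ["tf-idf", "bm25", "lmd"],
    "docs": ["sentences", "paragraph", "articles"],
    "minimal_size": [50, 100, 150, 200, 300],
    "stemming": [True, False],
    "stop_words": [True, False],
    "synonyms": [True, False],
    "corpus": ["ck12", "science wiki", "all wiki"],
}


def iter_configs(grid):
    """Yield one dict per combination of option values."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


configs = list(iter_configs(GRID))
print(len(configs))  # 3 * 3 * 5 * 2 * 2 * 2 * 3 = 1080
```

Even with only the listed values this gives 1080 index configurations, so each added option value multiplies the number of indexes to build and evaluate.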
21. • Which of the following is not a real chemical element?
• Hydrogen
• Oxygen
• Gold
• Watermelon
22. • A widow's peak is inherited by anyone with a dominant allele, while a straight hairline is inherited only when the offspring receives recessive alleles from both parents. Which of the following genotypes is MOST likely to have a straight hairline phenotype?
• ww
• wW
• Ww
• WW
23. • A widow's peak is inherited by anyone with a dominant allele, while a straight hairline is inherited only when the offspring receives recessive alleles from both parents. Which of the following genotypes is MOST likely to have a straight hairline phenotype?
• ww -> ww
• wW -> ww
• Ww -> ww
• WW -> ww
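One plausible reading of the arrows above, stated here as an assumption: a case-insensitive normalization step (plain lowercasing, as typical search analyzers apply) maps all four options to the same token, so retrieval can no longer tell them apart. A small sketch, with `normalize` as a hypothetical stand-in for the real preprocessing:

```python
def normalize(token: str) -> str:
    # Assumed preprocessing step: plain lowercasing, nothing else.
    return token.lower()


options = ["ww", "wW", "Ww", "WW"]
normalized = [normalize(o) for o in options]

print(normalized)            # ['ww', 'ww', 'ww', 'ww']
print(len(set(normalized)))  # 1 -> every option produces the same query term
```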
25. Dataset is a bitch
• NNs require lots of data
• Only 2.5k Q&A pairs
26. Ideas
• Use pre-trained embeddings (word2vec)
• Use dumb networks
• Get more data - data augmentation
• Add IR results as memory (MemN2N)
27. Best Architecture
• Simplified architecture
• Few neurons: embedding=10, lstm=32
• Max pooling
• Wrong-answer shuffling
• Allen AI 4th grade datasets
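The slides do not define the shuffling of wrong answers. The sketch below assumes it means re-ordering the answer options of a question to create extra training examples from a small dataset; the function name `shuffle_augment`, its parameters, and the example tuple layout are hypothetical.

```python
import random


def shuffle_augment(question, correct, wrong, n_copies=3, seed=0):
    """Return n_copies examples with the answer options in random order.

    Each example is (question, options, index_of_correct_answer).
    Assumes the correct answer does not also appear among the wrong ones.
    """
    rng = random.Random(seed)  # local generator, so results are reproducible
    examples = []
    for _ in range(n_copies):
        options = [correct] + list(wrong)
        rng.shuffle(options)
        examples.append((question, options, options.index(correct)))
    return examples


examples = shuffle_augment(
    "Which of the following is not a real chemical element?",
    correct="Watermelon",
    wrong=["Hydrogen", "Oxygen", "Gold"],
)
for _, options, label in examples:
    print(options, label)
```

Because the label is recomputed after each shuffle, the correct option stays tracked while its position varies, which keeps a model from learning answer position as a shortcut.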