Lexisnexis june9

  1. LEXISNEXIS, NCSTATE OPPORTUNITIES
     TIM MENZIES
     COMPUTER SCIENCE, JUNE 2015
  2. SEBIG LAB: SE FOR BIG DATA
     • Three-year partnership
     • New lab to explore SE methods for big data apps.
     • Grow skill set of engineers:
     • Assess different approaches to Big Data
     • Validation of results
  3. LAB PROCESSES VS INDUSTRIAL PROCESSES
     • Lab processes
     • Make 10ml of oxygen?
     • Easy!
     • Make 100,000 liters per day?
     • That’s another matter
  4. INDUSTRIAL PROCESSES FOR DATA MINING
  5. INDUSTRIAL PROCESSES FOR DATA MINING
     [Diagram with callouts numbered 1 to 5]
  6. EXPLORING NEW ALGORITHMS [callout 1]
     • New ideas
     • SVM
     • Deep learning
     • Ensembles
     • etc
     • Visualizations
     • Parameter tuning
     • Synonym discovery
     • Incremental association rule learning
  7. VALIDATION STUDIES [callout 2]
     • Independent checks of industrial results
     • Optimizing validation:
     • ? Mechanical Turk
     • Better support tools for coding new functionality
     • Better test suites for certifying new functionality
  8. CAN WE MAKE BETTER USE OF OLD KNOWLEDGE? [callout 3]
     • Learning domain ontologies.
     • Corpus definition.
     • How to revise old knowledge?
     • The privileged review problem.
     • Transfer learning.
  9. SUPPORT [callout 4]
     Gather case study data
     Synthetic studies
     Anonymization of data
     Training
     • Papers
     • Tutorials
     • Learning information seeking behavior
  10. LESS IS MORE [callout 5]
     • Reasoning via fewer, most representative examples
     • Active learning
     • Early stopping
     • Stack ranking (early stop)
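
The "stack ranking (early stop)" bullet on the last slide can be illustrated with a minimal sketch. The slides do not say how the ranking or the stopping rule would work, so everything below is an assumption for illustration: the function name `review_with_early_stop`, the `score` and `is_relevant` callbacks, and the `patience` rule (stop after a fixed number of consecutive non-relevant items) are all hypothetical.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def review_with_early_stop(
    items: Iterable[T],
    score: Callable[[T], float],
    is_relevant: Callable[[T], bool],
    patience: int = 3,
) -> Tuple[List[T], int]:
    """Review items from highest to lowest score and stop early.

    Hypothetical stopping rule (not from the slides): give up once
    `patience` consecutive reviewed items turn out to be non-relevant.
    Returns the relevant items found and the number of items reviewed.
    """
    ranked = sorted(items, key=score, reverse=True)  # the "stack ranking"
    found: List[T] = []
    reviewed = 0
    misses = 0  # consecutive non-relevant items seen so far
    for item in ranked:
        reviewed += 1
        if is_relevant(item):
            found.append(item)
            misses = 0
        else:
            misses += 1
            if misses >= patience:
                break  # the "early stop"
    return found, reviewed


if __name__ == "__main__":
    # Toy data: ten documents scored by their own value; three are relevant.
    relevant = {9, 8, 6}
    found, reviewed = review_with_early_stop(
        range(10), score=lambda d: d, is_relevant=lambda d: d in relevant
    )
    print(found, reviewed)
```

In this toy run the reviewer reads only part of the ranked stack before stopping, which is the "less is more" trade-off: fewer items inspected, at the risk of missing relevant items that were ranked low.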
