(aomushi510)




2009-11-24
aomushi
• perl
• casual-perl IRC




aomushi
• JPA




• Algorithm::NaiveBayes OK/NG




Algorithm::NaiveBayes
• http://search.cpan.org/~kwilliams/Algorithm-NaiveBayes-0.04/lib/Algorithm/NaiveBayes.pm
• AI::Categorizer, Lingua::JA::Categorize
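
As a rough illustration of what a naive Bayes classifier such as Algorithm::NaiveBayes does with word counts, here is a minimal multinomial naive Bayes sketch in plain Perl with add-one smoothing. It is not the module's implementation: `train_nb`, `predict_nb`, and the sample words are hypothetical, and the module's own scoring and scaling may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: build a model from [ $label, { word => count } ] pairs.
sub train_nb {
    my @instances = @_;
    my ( %docs, %word_count, %total, %vocab );
    for my $instance (@instances) {
        my ( $label, $attributes ) = @$instance;
        $docs{$label}++;
        $total{$label} += 0;    # make sure every label has a total
        for my $word ( keys %$attributes ) {
            $word_count{$label}{$word} += $attributes->{$word};
            $total{$label}             += $attributes->{$word};
            $vocab{$word} = 1;
        }
    }
    return {
        docs       => \%docs,
        word_count => \%word_count,
        total      => \%total,
        vocab_size => scalar( keys %vocab ),
        n_docs     => scalar(@instances),
    };
}

# Hypothetical helper: return { label => probability }, summing to 1.
sub predict_nb {
    my ( $model, $attributes ) = @_;
    my %log_score;
    for my $label ( keys %{ $model->{docs} } ) {
        # Log prior: share of training instances carrying this label.
        my $score = log( $model->{docs}{$label} / $model->{n_docs} );
        for my $word ( keys %$attributes ) {
            my $seen = $model->{word_count}{$label}{$word} || 0;
            # Add-one smoothing keeps unseen words from zeroing the score.
            my $p = ( $seen + 1 )
                / ( $model->{total}{$label} + $model->{vocab_size} );
            $score += $attributes->{$word} * log($p);
        }
        $log_score{$label} = $score;
    }
    # Shift by the maximum before exponentiating to avoid underflow.
    my ($max) = sort { $b <=> $a } values %log_score;
    my %prob = map { $_ => exp( $log_score{$_} - $max ) } keys %log_score;
    my $sum = 0;
    $sum += $_ for values %prob;
    $prob{$_} /= $sum for keys %prob;
    return \%prob;
}

# Made-up training data: two labels, one instance each.
my $model = train_nb(
    [ good => { perl => 2, cpan   => 1 } ],
    [ bad  => { spam => 3, casino => 1 } ],
);
my $scores = predict_nb( $model, { spam => 2 } );
printf "%s => %.3f\n", $_, $scores->{$_} for sort keys %$scores;
```

The input and output shapes (a hash of word counts in, a hash of label scores out) mirror how the slides use the classifier, which is the only part of the sketch taken from the talk.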

aomushi
• MeCab




aomushi
• MeCab
[MeCab output: one comma-separated feature line per token]


aomushi
• NaiveBayes
[MeCab output: one comma-separated feature line per token]




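    # Count the words in $text that pass the part-of-speech filter.
    # The Japanese text of the two patterns below is missing from this copy of the slide.
    for (my $node = $self->mecab->parse($text); $node; $node = $node->next) {
        my $info = $node->feature;    # comma-separated feature string
        my $word = $node->surface;    # the token as it appears in the text
        next unless $info;
        if ( $info =~ /^   / ) {
            next if $info =~ / |        |     |       |    |          /;
            # Skip words on the tokenizer's stop list.
            next if List::MoreUtils::any { $word eq $_ } @{ $self->_skip_word };
            $data->{$word}++;
        }
    }
    return $data;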
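
The filtering loop above can be tried without MeCab installed by feeding it hand-written nodes. This is a minimal sketch, assuming the filter keeps nouns (名詞) and drops a few noun sub-types. Those patterns, the sample feature strings, the skip list, and `count_words` are all assumptions for illustration, not the patterns used in the talk.

```perl
use strict;
use warnings;
use utf8;

# Hand-written stand-ins for MeCab nodes: [ surface, feature ].
my @nodes = (
    [ '青虫', '名詞,一般,*,*,*,*,青虫,アオムシ,アオムシ' ],
    [ 'は',   '助詞,係助詞,*,*,*,*,は,ハ,ワ' ],
    [ '3',    '名詞,数,*,*,*,*,*' ],
    [ '匹',   '名詞,接尾,助数詞,*,*,*,匹,ヒキ,ヒキ' ],
    [ '青虫', '名詞,一般,*,*,*,*,青虫,アオムシ,アオムシ' ],
    [ 'perl', '名詞,一般,*,*,*,*,*' ],
);
my %skip_word = ( perl => 1 );    # hypothetical stop list

# Hypothetical helper: count the surfaces that pass the filter.
sub count_words {
    my ( $nodes, $skip ) = @_;
    my %count;
    for my $node (@$nodes) {
        my ( $surface, $feature ) = @$node;
        next unless $feature =~ /^名詞/;    # assumed: keep nouns only
        # Assumed sub-types to drop: numbers, suffixes, dependent nouns, pronouns.
        next if $feature =~ /^名詞,(?:数|接尾|非自立|代名詞)/;
        next if $skip->{$surface};         # stop-listed word
        $count{$surface}++;
    }
    return \%count;
}

my $data = count_words( \@nodes, \%skip_word );
```

With these sample nodes only one surface survives the filter, and it is counted twice. A word-to-count hash of this shape is what the classifier takes as `attributes`.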




mecab
• naist-dic wikipedia
    • deepneko
    • http://deepneko.dyndns.org/kokotech/2009/06/mecabwikipedia.html
• NG

• NG
• OK   NG
• 10000


    # Add every training text to the classifier under its category label.
    while ( my ( $label, $ref ) = each %$categories ) {
        my $words = $self->_get_words( $ref->{display} );
        foreach (@$words) {
            my $tokenizer = MyFilter::Util::Tokenizer->new;
            my $word_set  = $tokenizer->tokenize( $_, $self->threshold );

            $brain->add_instance(
                attributes => $word_set,
                label      => $label,
            );
        }
    }
    # Train once, after all instances have been added.
    $brain->train;
    $brain->save_state($save_file) if $save_file;


    # Score a tokenized text against the trained model.
    sub categorize {
        my ( $self, $word_set ) = @_;

        return $self->brain->predict( attributes => $word_set );
    }




• bad


  $result = {
     good => 0.092,
     bad => 0.996,
  };
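
One way to turn a result hash like this into an OK/NG decision is to take the highest-scoring label and refuse to decide when the top two scores are close. The two scores on the slide add up to more than 1, so the sketch treats them as relative scores rather than probabilities. `pick_label` and the 0.5 margin are hypothetical and not taken from the talk.

```perl
use strict;
use warnings;

# Scores as shown on the slide.
my $result = {
    good => 0.092,
    bad  => 0.996,
};

# Hypothetical helper: return the top label, or undef when the two best
# scores are closer together than $margin.
sub pick_label {
    my ( $scores, $margin ) = @_;
    my @ranked = sort { $scores->{$b} <=> $scores->{$a} } keys %$scores;
    return unless @ranked;
    return $ranked[0] if @ranked == 1;
    my $gap = $scores->{ $ranked[0] } - $scores->{ $ranked[1] };
    return $gap >= $margin ? $ranked[0] : undef;
}

my $label = pick_label( $result, 0.5 );    # 0.5 is an arbitrary example margin
print defined $label ? "label: $label\n" : "undecided\n";
```

Returning undef for close calls leaves room for a manual check instead of forcing every text into good or bad.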

• ao shi



P-1
• 200 NG




3

2

1




• Algorithm::NaiveBayes
• mecab
• yusukebe



• Algorithm::NaiveBayes
    • http://search.cpan.org/~kwilliams/Algorithm-NaiveBayes-0.04/lib/Algorithm/NaiveBayes.pm
• mecab wikipedia
    • http://deepneko.dyndns.org/kokotech/2009/06/mecabwikipedia.html
• Lingua::JA::Categorize
    • http://search.cpan.org/~miki/Lingua-JA-Categorize-0.01001/lib/Lingua/JA/Categorize.pm



Casual-Talk #1: On the Ecology of the Aomushi (Green Caterpillar)
