A Corpus Linguistics Based Approach for Estimating Arabic Online Content
Published in: Technology, Education

  1. A Corpus Linguistics Based Approach for Estimating Arabic Online Content
  2.
  3.
  4. 5,340,000
  5. 1,950,000
  6. 0.5%
  7. 1%
  8. 1.4%
  9. 3%
  10. 1.4%, 0.5%, 1%, 3%
  11.
  12. Zipf’s Law
  13.
  14. Corpora Building
  15.
  16. DMOZ corpus: 75,560 pages, 530.1 MB, 659,756 unique words
  17. Wikipedia corpus: 95,140 pages, 213.3 MB, 760,690 unique words
  18. CCA corpus: 377 pages, 82,878 unique words
  19.
  20.
  21. Common
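
Slide 12 names Zipf’s law, which in its simplest form says that a word’s frequency is roughly inversely proportional to its frequency rank, so rank times count stays approximately constant. The transcript does not preserve how the slides apply it, so the following is only a minimal sketch of a rank-frequency table. The function name `rank_frequency` and the toy token list are assumptions made for illustration and are not taken from the presentation or its corpora.

```python
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, word, count, rank * count) rows, most frequent word first."""
    counts = Counter(tokens)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, word, count, rank * count)
        for rank, (word, count) in enumerate(ordered, start=1)
    ]


# Toy tokens invented for illustration; not drawn from any corpus in the slides.
# The counts 12, 6, 4, 3 were chosen to follow an exact 1/rank pattern.
toy_tokens = ["a"] * 12 + ["b"] * 6 + ["c"] * 4 + ["d"] * 3

rows = rank_frequency(toy_tokens)
for rank, word, count, product in rows:
    print(rank, word, count, product)
```

On real text the product in the last column would only be roughly stable, and it typically drifts at the very top and in the long tail of rare words.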
