In 2010 we had the idea of running multiple graduation projects with common themes. The themes selected for that year were "Arabic NLP" and "Pen computing". This presentation outlines the two themes and suggests several project ideas for them (plus some graduation project ideas not related to the two themes).
2. Themes?
Multiple teams work on different, but related,
ideas.
Why?
So that no team is alone in unexplored territory.
Teams can help each other, and supervisors can help a
large group of people at once.
Can continue the themes over the years.
Can plant the seeds for research groups in FCIS.
3. My two suggested themes
Arabic NLP, serving research in the Arabic
language.
Awraq device, a tablet project that I plan to
eventually produce.
4. Ideas for Arabic NLP?
Previous FCIS projects in NLP, Arabized.
Projects oriented towards Applications.
Projects oriented towards Research.
Projects oriented towards Theory.
5. Previous NLP projects, Arabized
Translation from English to Arabic (2004) →
Make it Arabic to English.
Text Summarization
Natural language to UML
Natural language to mind maps
Document classification/ clustering
6. Arabic Applications
Semantic AdSense-like platform
Understand what the web page is about before
serving advertisements.
If semantic analysis is too hard, consider making it contextual.
"Animate my story"
To encourage children to express themselves in
Arabic (a modern version of Sakhr's 1986 program)
Arabic spelling checker or grammar checker.
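To make the "contextual" fallback of the AdSense-like idea concrete, here is a minimal sketch that picks an ad by counting keyword matches in the page text. The function names, the ad inventory and the scoring rule are all illustrative assumptions, not part of the original proposal; a real Arabic system would also need normalization and stemming before matching.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def pick_ad(page_text, ads):
    """Return the name of the ad whose keywords occur most often in the page, or None."""
    counts = Counter(tokenize(page_text))
    best_name, best_score = None, 0
    for name, keywords in ads.items():
        # Score = total number of occurrences of the ad's keywords in the page.
        score = sum(counts[k.lower()] for k in keywords)
        if score > best_score:
            best_name, best_score = name, score
    return best_name

# Hypothetical ad inventory: ad name -> keywords it should match.
ads = {
    "bookstore": ["book", "novel", "reading"],
    "travel": ["flight", "hotel", "trip"],
}
page = "Book a cheap flight and hotel for your next trip."
print(pick_ad(page, ads))
```

A semantic version would replace the keyword count with some model of what the page is about; the surrounding interface could stay the same.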
7. Research in Arabic NLP
Text mining/ abstraction over text
Gain desired information from text, without having
to read all of it. e.g:
− Monitoring public opinion from collections of newspaper
articles.
− Extracting a list of scientific topics from research papers
“This paper discusses the following plant diseases ....”
− Extracting summaries of publications
− Extracting references and citations from publications, like
CiteSeer or Google Scholar.
It would be good to choose a specific area and
focus on it, since general text abstraction could
be too large a project.
8. Research in Arabic NLP
Performing إعراب (i'rab, grammatical analysis) on (some types of) Arabic
sentences.
Could use existing wordlists and tree banks or
create our own limited (but well-designed) data
sets.
Extracting semantic information from sentence
structure...
What's the difference between "زيدٌ المنطلق",
"المنطلق زيدٌ" and "زيدٌ ينطلق"?
What's the difference between "جاءني مسرعًا",
"جاءني وهو مسرع" and "جاءني يسرع"?
9. Research in Arabic NLP
From natural language to formal language:
A form of natural language-based programming
A form of natural language-based knowledge
representation (e.g. representing things like expert
system rules or Prolog facts and rules in natural
language).
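As a rough illustration of the "natural language to Prolog facts and rules" idea, the sketch below translates two fixed sentence shapes of a tiny controlled English into Prolog clauses. The patterns and the `to_prolog` function are hypothetical and deliberately simplistic; a real project would need a proper grammar, and an Arabic one would need morphological analysis first.

```python
import re

# Hypothetical sentence patterns for a tiny controlled English; not a real grammar.
RULE = re.compile(r"^Every (\w+) is a (\w+)\.$", re.IGNORECASE)
FACT = re.compile(r"^(\w+) is a (\w+)\.$", re.IGNORECASE)

def to_prolog(sentence):
    """Translate one controlled-English sentence into a Prolog clause, or return None."""
    s = sentence.strip()
    m = RULE.match(s)
    if m:
        # "Every human is a mortal." -> mortal(X) :- human(X).
        body, head = m.group(1).lower(), m.group(2).lower()
        return f"{head}(X) :- {body}(X)."
    m = FACT.match(s)
    if m:
        # "Socrates is a human." -> human(socrates).
        subject, category = m.group(1).lower(), m.group(2).lower()
        return f"{category}({subject})."
    return None

for line in ["Socrates is a human.", "Every human is a mortal."]:
    print(to_prolog(line))
```

The rule pattern is tried first so that a sentence starting with "Every" is never mistaken for a fact about an individual.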
10. Arabic NLP all in one page
Translation, A2E.
Arabic text summarization
Arabic → UML
Arabic → Mind maps
Document classification/ clustering
Semantic or contextual AdSense
Animate my story
Spelling or grammar checker
Text mining (newspapers, books, topics in publication content, publication
references, other practical areas...)
إعراب (grammatical analysis) of some types of sentences
Semantic information from sentence structure
NLP based programming or knowledge representation
12. Awraq project ideas
There are two types of suggested Awraq
projects:
Awraq technology: research for creating the
device itself.
Awraq applications: useful programs that run on
Awraq-like tablet devices.
13. Technology: Pen-based input
Arabic handwriting recognition
It's an active research topic
Previous projects usually have limitations on how the input is
formed
Can understand محمد but not محمد
Enforce horizontal lines
Can have problems with dots or diacritics.
Let's remove some limitations ☺
An additional feature: entering formatted text (i.e. recognizing font
size, underlines, alignment, etc.)
Advanced idea: recognize the font (نسخ Naskh, رقعة Ruq'ah...)
Required techniques: Pattern recognition, image processing.
14. Technology: pen-based editing
How does the user edit when he has no cursor,
no keyboard, no mouse?
We should invent an interaction model beyond
the keyboard/mouse.
The project would be two parts:
1- Creating a user interaction model
2- Implementing it on a pen device
Techniques involved: User interaction design
(HCI), pattern recognition.
15. Technology: New types of applications
Web applications like Google docs are
replacing traditional desktop applications.
We want Awraq applications to use web
technologies (HTML, CSS, JS...) without missing
important features like copy/paste of rich media,
drag and drop, saving to files, OLE, etc.
So the project is "A runtime environment for
hybrid applications that combine desktop
and web technologies"
Standards are being developed in many of
these areas, and the project could use these
standards where possible.
16. A sketch-recognition language
In 2005, an MIT researcher called Tracy
Hammond worked on a language called
LADDER, which can be used to describe any
sketch to be recognized and edited.
Hammond's publications are here:
http://srlweb.cs.tamu.edu/srlng/people/users/thammond?papers=all#Papers
(search for LADDER).
She's now working on more advanced languages
with similar goals.
17. A sketch-recognition language
This is an interesting goal; it would enable
Awraq developers to easily add sketch
recognition to their own applications, without
needing to be CS researchers.
And there is a growing interest in the area of
sketch recognition (MIT design rationale, TAMU,
others...) so there is a lot of literature to read.
The project is "A language for defining
elements for sketch recognition" (needs a better
title!)
Techniques needed: Compilers, Pattern
recognition
18. A sketch-recognition language
From "Enabling Instructors to Develop Sketch
Recognition Applications for the Classroom",
again by Dr Hammond:
Scenario of the Future:
The class is Computability; the instructor is teaching
finite state machines (FSMs) today. Writing on a
SmartBoard behind her, she explains how an FSM
works by drawing one into a sketch recognition
system that she built before class in less than a
half-hour.
19. Technology: Other projects
Improving the Awraq operating system kernel:
Improving battery consumption...
Improving graphics speed on low processors
Working on a managed OS (e.g. in JOS (Java) or
Singularity, Cosmos (C#))
Improving networking features
e.g. adding ad-hoc networking for instant
wireless classrooms.
Techniques: Operating systems, graphics,
networking
20. Applications: Design rationale
"Design Rationale" is a research group at MIT;
a lot of their work is related to design with
sketches.
One of the important people there is Randall
Davis. He's the mind behind a lot of their work.
They have a lot of publications on their work
(hint: we can use them as references :D)
http://rationale.csail.mit.edu/
The MIT D/R group is not the only participant.
Other universities are working on the same
goals.
The following slides are about all of them...
21. Applications: Design rationale
So, what did they work on?
Improving Sketch recognition technology
e.g. separating text from drawings. See the paper by
Akshay Bhat (from TAMU)
LADDER
Applying Sketch recognition to many domains...
Electronic circuit diagrams
UML diagrams
Physics (see ASSIST video)
…
Multimodal interaction, e.g. combining speech with sketching
22. Applications: Design rationale
This is a general project idea:
Pick a certain kind of diagram
Make your project "Working with _______
diagrams on a pen-computing device"
Add domain-specific features to your application
(e.g. generating code from UML, simulating
electronic circuits, analyzing equations...)
For more scienceness, if you find a certain
technique to be promising, call your project "Using
____ for working with _____ diagrams on a
pen-computing device"
Techniques used: Pattern recognition, _______
26. Applications: 3D Sketching
http://www.dgp.toronto.edu/~shbae/ilovesketch.htm
27. Other applications
Applications that use hardware peripherals:
Sound-based keyboard with 2 mics.
Improving hand-tracking or eye-tracking with
video camera.
Applications that enhance the "school textbook/
copybook" experience
A pen-based word processor that organizes
ideas, not just formats text.
Advanced copybook (how?)
28. Awraq: all in two pages
Technology
Arabic handwriting recognition
Gestures for pen-based editing
Hybrid desktop/web applications development
system
Sketch-recognition language (like LADDER)
Improved OS kernel (Battery consumption,
Graphics speed, Managed OS)
Ad-hoc networking
29. Awraq: all in two pages
Applications
Design rationale style:
Computer architecture (decoders, adders...)
Sketch-based form editor
Sketch-based HTML editor
Sketchcode
3D Sketching
Related peripherals
Sonic keyboard w/mics
Hand tracking
Better tools for writing and education
Word processor that organizes ideas
Future textbook, copybook
30. Other ideas: Education
Automatic location and recognition (OCR) of
material on whiteboards from lecture videos.
Automatic lecture transcription (speech to text).
Can add extra features:
e.g. editing the video by editing the text.
e.g. separating the lecturer's voice from student questions
Automatic marking of programming assignments
Automatic detection of plagiarism (for code
assignments or in general).
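One common starting point for detecting plagiarism in code assignments is to compare token n-grams after normalizing identifier names, so that simply renaming variables does not hide a copy. The sketch below is an illustration under that assumption; it targets Python submissions, and the function names and the choice of 3-token windows with Jaccard similarity are not from the original slides.

```python
import keyword
import re

# Identifiers/keywords, integer runs, or any single non-space symbol.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")

def normalize(code):
    """Tokenize the code and replace every non-keyword identifier with 'ID'."""
    tokens = []
    for tok in TOKEN.findall(code):
        if (tok[0].isalpha() or tok[0] == "_") and not keyword.iskeyword(tok):
            tokens.append("ID")
        else:
            tokens.append(tok)
    return tokens

def shingles(tokens, n=3):
    """Return the set of all windows of n consecutive tokens."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def similarity(code_a, code_b, n=3):
    """Jaccard similarity of the two submissions' token n-gram sets (0.0 to 1.0)."""
    a, b = shingles(normalize(code_a), n), shingles(normalize(code_b), n)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

original = "def add(x, y):\n    return x + y\n"
renamed = "def plus(a, b):\n    return a + b\n"
unrelated = "for i in range(10):\n    print(i)\n"
print(similarity(original, renamed))
print(similarity(original, unrelated))
```

A high score only flags a pair for human review; short assignments with one obvious solution will naturally look alike.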
31. Other ideas: Development tools
Code instrumentation (programs automatically
display real-time info about the call stack, value
of variables...)
Back-in-time debugging for .NET
Automated GUI testing (like Selenium for web)
Continuations-based web applications
VoiceCode (sister of SketchCode) add-in for an
IDE
Automated A/B testing for .NET
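The code instrumentation idea above (programs that report their own call stack and values as they run) can be illustrated with a small decorator. This is only a sketch in Python rather than .NET; the `traced` decorator and the `log` list are hypothetical names, and a real tool would more likely rewrite code automatically than require manual annotation.

```python
import functools

_depth = 0   # current nesting depth of traced calls
log = []     # collected trace lines, in call order

def traced(func):
    """Record entry (with positional arguments) and exit (with result) of each call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        global _depth
        log.append(f"{'  ' * _depth}-> {func.__name__}{args}")
        _depth += 1
        try:
            result = func(*args, **kwargs)
        finally:
            # Restore the depth even if the call raises.
            _depth -= 1
        log.append(f"{'  ' * _depth}<- {func.__name__} = {result!r}")
        return result
    return wrapper

@traced
def factorial(n):
    return 1 if n <= 1 else n * factorial(n - 1)

factorial(3)
print("\n".join(log))
```

The indentation in the log mirrors the depth of the call stack, which is the kind of real-time view the project idea describes.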
32. Other ideas: Development tools
Wiki-based programming
Can develop blogs, message boards, etc. directly
by editing an empty web page from your web
browser
Kitty: programming mobile devices from the
mobile device itself
Take care of limitations like the small screen and
symbols that are hard to type on the keyboard...
33. Other ideas: Development tools
Completely visual programming language:
Thyrd
Subtext and its successor, Coherence
Visula
Google AppInventor
http://www.csse.monash.edu.au/~marriott/publications-web.html
Marten (http://andescotia.com/)
Sprog (sprog.sf.net)
Aluminum (aluminium.sf.net)
Thyrd and Subtext/Coherence seem promising
34. Other ideas: Development tools
Virtual reality building block programming
...Or other types of VR
visualization (e.g. networks)
35. Other ideas: Search and knowledge
Search engine visualization (e.g. Kartoo)
Search engine clustering (e.g. yippy.com)
36. Other ideas in one page
Education
Recognition of writing on whiteboards
Transcription of spoken lectures
Automatic marking of code assignments
Automatic detection of plagiarism
Development tools
Wiki-based programming
Kitty; mobile-hosted mobile programming language
Completely visual programming language
Virtual reality building-block programming
Search
Search engine visualization
Search engine clustering