At CNS we are developing a holistic artificial intelligence cybersystem. The system will be composed of seven intertwined modules: Memory, Sensation, Perception, Reasoning, Thought, Consciousness, and Decision-Making and Action.
We are well aware that the task is monumental and improbable, and that is exactly the point. At CNS we believe that in the process of trying to make the impossible possible, we must stretch our minds to their limit, think unlike anyone else, be original and inventive, and devise solutions that challenge the boundaries of what we know.
Case study: Big Data
Big Data on a teaspoon:
Big data is a set of approaches, methods, and tools that require new ways to uncover critical hidden information in datasets of massive scale. Big data usually involves datasets whose sizes exceed the ability of commonly used tools to process and analyse the data within a practical, acceptable lead time. Big data is also growing fast: since 2012, dataset sizes have grown from tens of terabytes to petabytes today.
Challenges of Big Data:
The key problem of Big Data is that it is growing faster than Moore's law for computation speed. This problem will only get worse in the years to come, in particular with the next generation of challenges such as gene sequencers, NMR imaging, social media, the Internet of Everything, and future unknowns. It is important to note that when dealing with Big Data there are two crucial challenges. The first is identifying robust methods to extract the critical information from a Big Data set or, to put it in layman's terms, finding a needle in a haystack. The second is developing solutions that enable fast computation over Big Data, in particular when the data is growing faster than the computation rate. To deal with these challenges effectively, any proposed solutions and tools should be able to transform Big Data sets into Small Data sets while retaining all the relevant information and, ideally, eliminating data noise.
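CNS's own method is not described here and remains proprietary. As a generic illustration of the Big-Data-to-Small-Data goal, the sketch below uses a standard, unrelated technique (truncated SVD, a low-rank approximation) to compress a noisy matrix: the dominant components (the needle) are kept in a much smaller representation while most of the noise (the haystack) is discarded. All sizes and parameters are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dataset: a large matrix whose real information lives in a
# 5-dimensional subspace, contaminated with additive noise.
n, m, rank = 500, 400, 5
signal = rng.standard_normal((n, rank)) @ rng.standard_normal((rank, m))
noisy = signal + 0.01 * rng.standard_normal((n, m))

# Truncated SVD: keep only the k dominant components.
U, s, Vt = np.linalg.svd(noisy, full_matrices=False)
k = rank
compact = (U[:, :k], s[:k], Vt[:k, :])        # the "Small Data" representation
reconstructed = U[:, :k] * s[:k] @ Vt[:k, :]  # rebuilt from the small form

stored = sum(a.size for a in compact)          # values kept after reduction
original = noisy.size                          # values in the raw dataset
rel_err = np.linalg.norm(reconstructed - signal) / np.linalg.norm(signal)
print(f"kept {stored}/{original} values, relative error {rel_err:.4f}")
```

The compact factors hold roughly 2% of the original values, yet reconstruct the underlying signal almost exactly, because the discarded components carried mostly noise.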
How to transform Big Data sets into Small Data sets while retaining all the information
One of the most effective and well-established approaches to dealing with Big Data is statistics. A good representative sample of the Big Data set, in conjunction with the correct use of statistical methods and tools, can extract the vital information needed to answer our questions within a stated confidence level and margin of error.
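As a minimal sketch of this idea (assuming a Python/NumPy environment; the dataset and sample sizes are invented for the example), the code below estimates the mean of a large dataset from a small random sample and quantifies the uncertainty with a 95% margin of error:

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative "big" dataset: one million skewed records.
population = rng.exponential(scale=3.0, size=1_000_000)

# A small representative sample stands in for the full dataset.
sample = rng.choice(population, size=10_000, replace=False)
mean = float(sample.mean())

# 95% confidence margin via the normal approximation (z ~= 1.96).
margin = 1.96 * sample.std(ddof=1) / np.sqrt(sample.size)

true_mean = float(population.mean())
print(f"estimate {mean:.3f} +/- {margin:.3f} (true mean {true_mean:.3f})")
```

One percent of the data yields an estimate whose error band comfortably covers the true mean, which is precisely the "Small Data" promise of classical sampling.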
But what happens when statistics is not the appropriate approach, or the typology of the problem is not suited to statistical methods?
We at CNS have developed a groundbreaking method, and the accompanying tools, which under certain conditions (ill-posed problems) can reduce Big Data size by the square root of the data set dimension (i.e. a set of 10^9 data records is reduced to ~10^3), enabling us to vaporise the haystack (the Big Data set) while leaving the needle (the information) intact and free of noise. The innovative method and tools have been tested and a proof of concept has been established. The mathematical approach and proposed algorithms produce information reconstructions of greater quality than any other existing method, but at a one-off cost in convergence time. However, once the data has been transformed, the manipulation and analysis time is reduced significantly: our experimental results showed a reduction in processing time by a factor of 50. Another pronounced benefit of this approach is the ability to reconstruct the information with a high level of quality and completeness, regardless of the data structure or data size (great news for cloud computing). Although we have achieved notable results in the tests so far, additional experiments are planned to further solidify the validation of this innovative and breakthrough approach.
For our tests, we used data from NMR experiments, and were consistently able to reduce the original datasets from an average of 750 GB to an average of 0.045 GB, a reduction by a factor of roughly 10^4, without loss of information, while eliminating the data noise. At present, we are working on improving the method to reduce the dataset sizes even further. A paper with the preliminary results will be published by the end of July this year.