December 1, 2008
You can visualize data mining as a process of searching for treasure buried in the sand or digging up rock to mine for gold - thus 'mining', but the tools we use do it in a truly
The problem with data and why you do care
By Anna Skountzou
You are flooded with tons of data, day by day. Spreadsheets, reports, surveys, you name it. But you are neither the info junkie nor the number cruncher. You actually have no time to invest in it, even though you know that you’re wasting precious insights and significant potential. Despair? No.

The problem is simple. Too much information in spreadsheets or other formats, lying around your hard disk or in the cloud. Have you ever tried to imagine the amount of knowledge hidden inside? Or how such an input would make the difference for you and your company?

We bet that, even if it has already crossed your mind, you have neither the time nor the techniques to exploit this data. And, till now, you were forced to hire a connoisseur of data analysis, something that proved to be costly and time-consuming. And that was the best case; typically you just did nothing.

Well, till now. Our team not only relieves you of all these, but also extends what you used to consider as a statistical analysis. Put aside the fluffy terminology and start thinking of rules and patterns, all illustrated via expository writing and visualization schemes, finally contributing the least biased and most valuable signals for your decisions.

And keep in mind that latent knowledge presents both a hidden cost and a tremendous opportunity, but there is no need to be stressed about that anymore: you may now just mine your knowledge!
MINEKNOWLEDGE December 1, 2008
By Manos Androulakis
Speaking of getting the most out of your data sets, we provide you with a rock solid solution. The process goes like this:

1. You send your data to us.
2. You sit back, breathe some fresh air and enjoy every moment of your life in between.
3. You open your inbox and receive the very knowledge and secrets trapped in your data, unveiled.

And here are a few more reasons to choose us.

1. It is easy: Consultants, discussions, meetings. Forget them all. What you need to do is just send us an email, with your data set attached.
2. It is fast: Within a week, results and the very knowledge of your data set will pop up in your inbox.
3. It is fun: Honestly. Working with tons of data is so much fun, especially when others do all the work for you.
4. It is secure: Rest assured that we’ve done our best to keep your data and mining results safe and private (the latter may not be valid in our free services).
5. It is clear: If statistics sound Greek to you, we’re speaking your language.
6. It is insightful: You already know that; we give you the most valuable insights on your data in return.
7. It is affordable: The cost of a datamine.it analysis is simple: you either pay nothing, or €500.

“A miner with a mattock in his hand is a very rough way to conceptualize the complexity and state-of-the-art of the processes we execute. A diverse and extended set of exploration and filtering algorithms, next to a variety of learning and meta-learning techniques, are utilized, optimized and evaluated, while the problem is a computationally intensive one and demands a highly customized approach. So we’re putting human intelligence and our high expertise in between various advanced artificial intelligence algorithms, to finally provide you with the very secrets trapped in your data, unveiled.”
George Tziralis
stand as the least biased input to decision making, a pure source of insights and
“Think of a simple process. Then, make it simpler. Try to find the steps that are still vague. Cut them off. Finally, ask your grandma what she cannot understand. This is how we make things happen in
By Eleftheria Kanavou
It’s simple. You just email us your data set, in an .xls, .csv or .txt format. And then we take over. Clear enough?

The file should be in a form like the one in the figure. Let us make it even more clear.

Columns in the data set are attributes, like color, size, value and purchase decision; whatever your data set is made of. Attributes may be numeric (numbers), or nominal (one or more words). You also need to define the ‘target’ attribute: one or more characteristics you want us to focus on, whose behavior we explain based on all the other attributes.

You may also call each row an instance, a case, an example; in other words, a discrete set of values of each attribute (let’s say a person’s reply to a survey, a product’s characteristics in a list of products, quarter updates in a balance sheet, you name it). And there are no limitations on the number of columns and rows of your set.

Ok, that’s it. As long as you have the data set in that format, you just send it to us with an email at go(AT)mineknowledge.com. We’ll reply with a confirmation of receiving an appropriate file, plus an invoice via PayPal. And, within a week, you’ll have a fully fledged mineknowledge report, waiting in your inbox. We think it’s

There are two pricing plans:

FREE: You can have the whole report for free, if your data set has fewer than 30 columns (attributes) and 300 rows (instances), plus you agree that we may publish the analysis in our blog. In this case, the report may take up to a month.

MAX: No restrictions at all, delivery within a week, fully fledged analysis for a €500 flat price.

You may take a look at a typical datamine.it results report on our website, while we do wait for your data sets!
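If you would like to check your own file before emailing it, the data set format and the FREE plan size limits described on this page can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function names, the sample rows and the idea of a local pre-check are hypothetical and are not part of the service. Note that the FREE plan also requires agreeing to publication, which code cannot decide for you.

```python
import csv
import io

FREE_MAX_COLUMNS = 30  # FREE plan: fewer than 30 columns (attributes)
FREE_MAX_ROWS = 300    # FREE plan: fewer than 300 rows (instances)

# Hypothetical sample data, invented for illustration only.
SAMPLE = "color,size,value,buy\nyellow,10,4.5,yes\ngreen,12,3.0,no\n"


def is_numeric(value):
    """Return True if the cell can be read as a number."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def describe(csv_text, target):
    """Summarize a CSV data set: attribute types, size, FREE plan size check."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    names = reader.fieldnames or []
    if target not in names:
        raise ValueError(f"target attribute {target!r} not found")
    # An attribute is numeric if every value parses as a number, else nominal.
    types = {
        name: "numeric" if all(is_numeric(r[name]) for r in rows) else "nominal"
        for name in names
    }
    fits_free = len(names) < FREE_MAX_COLUMNS and len(rows) < FREE_MAX_ROWS
    return {
        "attributes": types,
        "instances": len(rows),
        "target": target,
        "fits_free_size_limits": fits_free,
    }


if __name__ == "__main__":
    print(describe(SAMPLE, target="buy"))
```

Here `buy` plays the role of the target attribute; any other column name from your own file would work the same way.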
By Athina Pandi
You used to think of column graphs and pie charts as the most insightful views you could expect from a data analysis. You’ll probably change your mind. Let’s take a look at a survey example.

A simplistic one. Consider the data set described on the previous page.

Let’s say it refers to answers gathered through a survey, or stored in your enterprise database. A typical analysis will finally come up with some graphs, like the ones following.

And you’re probably used to considering the analysis contributing a graph like the above as, well, fruitful. Same for the following one.

But, the question remains. Is that the most you can expect from a data set analysis? Have you actually gained deep insights from your data? The answer is a clear no. Let us show you why.

What follows is a set of rules that emerge from a proper datamine.it analysis, even for a data set as oversimplified as the above example. Try this

or, maybe this set of rules:

If color = yellow then buy = yes
If color = red then buy = yes
If color = white then buy = yes
If color = green then buy = no
If color = blue then buy = no
If color = black then buy = no

See the difference between the almost obvious and the really insightful?

Go find out more in a complete typical mineknowledge report. Yes, this is what your own data set will look like, just after a week. Still considering it? Check out our blog for more case studies. And send us your data, now.
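For the curious, here is a minimal sketch of how rules of the form "If color = ... then buy = ..." could be derived from survey rows. It assumes a simple majority vote per color, which is our simplification for illustration and not necessarily the method behind a datamine.it analysis. The rows and the names `ROWS`, `learn_rules` and `format_rules` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical survey rows (color, buy), invented for illustration only.
ROWS = [
    ("yellow", "yes"), ("yellow", "yes"), ("red", "yes"),
    ("white", "yes"), ("white", "no"), ("white", "yes"),
    ("green", "no"), ("blue", "no"), ("blue", "no"), ("black", "no"),
]


def learn_rules(rows):
    """Map each color to the most common 'buy' answer among its rows."""
    votes = defaultdict(Counter)
    for color, buy in rows:
        votes[color][buy] += 1
    return {color: counts.most_common(1)[0][0] for color, counts in votes.items()}


def format_rules(rules):
    """Render the learned rules in the 'If ... then ...' style used in the report."""
    return [f"If color = {color} then buy = {buy}" for color, buy in rules.items()]


RULES = learn_rules(ROWS)

if __name__ == "__main__":
    for line in format_rules(RULES):
        print(line)
```

A bar chart of the same rows would only show how many people picked each color; the rules state directly which colors go with a purchase.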
A few more words
By Eirini Lygkoni
MineKnowledge is a group of young and passionate
data engineers, each of us holding an engineering
diploma from NTUA and an MSc or PhD in Applied
Math, Statistics or Operations Research. We
are located in Athens, Greece and London, UK.
• George Tziralis is clearly a data junkie who
lives on his mac. In the rare case he’s logged off, he
enjoys dancing tango and organizing Open Coffee
meetings around Greece. Apart from that, at 26 he is
a serial entrepreneur, while he also teaches a data
mining post-graduate course via blog and tries to ﬁnd
some time to write up his PhD Thesis on markets for
• Athina Pandi is, alongside datamine.it, on her
second MSc at Imperial College. Communications &
Signal Processing is her latest interest, next to statistics,
data mining and networks. When she is offline, you
may find her in a pub around Hyde Park.
• Eleftheria Kanavou, with a strong inclination
for dancing, is the one who naturally gives rhythm to
the whole team. Her research interests include
stochastic processes and behavioral statistics, while
data mining is the physical outlet of her
• Manos Androulakis is the algo geek of the
team. The bigger the challenge and the data set, the
more determined he is to mine the next diamond.
His expertise lies in the areas of statistical designs,
variable selection methods and medical applications,
under the prism of data mining of course.
• Eirini Lygkoni is the epitome of doing magic
under pressure. A multi-tasker by nature, she is
addicted to statistics and probabilities, while she
literally can’t wait for the next data set to arrive. At
the same time, simplicity is her favorite word and
socializing her activity of choice for her rare free
time. Enough said.
• Lina Massou stands as the quiet power of the
team. With a strong background in information
theory and cryptography, she definitely is the one to
take good care of your data and come up with their
very knowledge, unveiled.
• Anna Skountzou excels at both statistics
research and ecological conscience. That said, she’s
definitely the one to look for, when you are looking at
extracting patterns to let you put your data into
much more efficient use, their green footprint
• Iro Zacharidou is a true data nut, next to a
party animal, putting her deep statistical expertise
aside. If you wonder about the outcome of these
coming together, rest assured that the required
amount of persistence and professionalism to put
your data into investigation will be largely outrun.

data is our passion and mining our joy
We literally can’t wait to put our hands on your data; get ready to be impressed -or even excited- by the precious insights that you’ll receive in just a week.

MINEKNOWLEDGE
Athens, Greece | London, UK
email@example.com
http://www.mineknowledge.com