
AI in finance - from hype to marketing and cybersecurity use cases

We all love AI. But what about financial applications? It turns out that AI, and in particular machine learning (ML) and deep learning (DL), can be applied very effectively to financial services. In this presentation, Natalino will illustrate a number of use cases, such as transaction fraud prevention and credit authorization, using AI and machine learning techniques.

Starting from there, Natalino will show how those problems can be solved with AI techniques, with code snippets and live demos using Keras, TensorFlow and scikit-learn applied to some financial datasets.

Natalino will take you on this AI-for-finance journey, describing how techniques such as deep learning, t-SNE and dimensionality reduction can be used as the "data engines" for next-gen financial applications in both retail and commercial banking.

See more at: http://globalbigdataconference.com/santa-clara/global-artificial-intelligence-conference-83/speaker-details/natalino-busa-41434.html

Published in: Technology

  1. Natalino Busa - @natbusa Global Artificial Intelligence Conference AI in Finance: from Hype to marketing and cyber security use cases
  2. www.globalbigdataconference.com Twitter: @bigdataconf Global Artificial Intelligence Conference AI in Finance: from Hype to marketing and cyber security use cases Natalino Busa Twitter: @natbusa
  3. Cognitive Finance Group: Advisory Board Member. ING Group: Enterprise Architect (Cybersecurity, Fintech). Teradata: Head of Applied Data Science. Teradata: Global Evangelist on Open Sourced Technologies. O’Reilly: Author and Speaker. Philips: Senior Researcher, Data Architect. LinkedIn and Twitter: @natbusa
  4. What about AI in Finance?
  5. The Medici Bank (Italian: Banco Medici), 1397–1494
  6. Data as a Relationship ● Trust ● Transparency of Use ● Customer First ● Regulations and Laws ● Respect and Protect ● Providing a Service
  7. An ethical approach for Actionable Financial Data 1. Help the customer: Propose, Advise, Select, Filter, Connect, Simplify 2. Protect the customer: Detect, Prevent, Alert, Block, Defend, Identify, Authorize
  8. Personalized Financial
  9. Financial personalized recommenders ● Innovation helps to empower people to make better financial decisions. ING has launched several new omni-channel banking platforms. ● The platform gives customers insights into their personal finances in an easy and intuitive way. http://www.slideshare.net/ING/4q15-media
  10. Financial personalized recommenders ● It Knows Finance ● Conversational ● Personal ● Actionable ● Predictive ● Reuse Existing Content
  11. Inspiration from the Web
  12. Credit Pre-Authorization
  13. Strategic data-driven initiatives ● Fintech innovation to help strengthen our lending capabilities and better serve our consumer and SME clients. ● Kabbage, one of the leading US-based technology platforms providing automated lending to SMEs. ● In January 2016, ING made an investment in fintech WeLab, which provides consumer loans in China and Hong Kong in a fully automated process that takes just minutes, from application to approval. http://www.slideshare.net/ING/4q15-media
  14. Approaching (Almost) Any Machine Learning Problem - Abhishek Thakur, Kaggle Grandmaster. Diagram labels: data labels; raw data (tables, files); data munging; useful data; feature engineering; tabular data ready for ML
  15. Predictive APIs: How to get there? Rule-based System: Input → Hand Designed Program → Output. Classic Machine Learning: Input → Hand Designed Features → Mapping from features → Output. Representational Machine Learning: Input → Learned Features → Mapping from features → Output. Deep Learning (end-to-end learning): Input → Learned Features → Learned Complex features → Mapping from features → Output. Source: Prof. Yoshua Bengio - Deep Learning https://youtu.be/15h6MeikZNg
  16. From Feature to Architecture Engineering
  17. Demo: Credit Payment Defaulting with TensorFlow and Keras. Methodology: This research aimed at the case of customers' default payments in Taiwan and compares the predictive accuracy of probability of default among six data mining methods. From the perspective of risk management, the predictive accuracy of the estimated probability of default will be more valuable than the binary result of classification (credible or not credible clients). https://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients
  18. Step 0: data exploration. Target variable: default payment next month. Color scheme: defaulting (yes) vs. not defaulting
  19. Step 1: feature engineering
      pay_1           -1
      pay_2           0
      pay_3           -1
      pay_4           0
      pay_5           0
      pay_6           0
      pay_avgamt1     0.203221
      pay_avgamt2     3.72718
      pay_avgamt3     1.01611
      pay_avgamt4     0.914495
      pay_avgamt5     0.0700097
      pay_avgamt6     0.0689935
      pay_stdavgamt   1.40083
      pay_avg         -0.333333
      pay_std         0.516398
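  20. Step 1: baseline (e.g. regression)
      from keras.models import Sequential
      from keras.layers import Dense, Activation
      input_dim = 87  # number of input features, as in the slide's diagram
      model = Sequential()
      model.add(Dense(1, input_shape=(input_dim,)))  # one output unit, no hidden layers
      model.add(Activation('relu'))
      Diagram: 87 inputs → 1 output. It’s a neural network … with no network :)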
  21. Step 2: deep learning
      from keras.models import Sequential
      from keras.layers import Dense, Dropout, Activation
      # input_dim is the number of input features (87 in the slide's diagram)
      model = Sequential()
      model.add(Dense(256, input_shape=(input_dim,), activation='relu'))
      model.add(Dense(256, activation='relu'))
      model.add(Dropout(0.25))
      model.add(Dense(64, activation='relu'))
      model.add(Dense(64, activation='relu'))
      model.add(Dropout(0.25))
      model.add(Dense(64, activation='relu'))
      model.add(Dense(64, activation='relu'))
      model.add(Dropout(0.25))
      model.add(Dense(10, activation='sigmoid'))
      model.add(Dense(1))
      model.add(Activation('sigmoid'))  # probability of default
      Diagram, layer sizes: 87 → 256 → 256 → 64 → 64 → 64 → 64 → 10 → 1
  22. Step 3: compare: is deep learning better? Shallow Logit Model (87 → 1) vs. Deep Learning (87 → 256 → 256 → 64 → 64 → 64 → 64 → 10 → 1)
  23. Step 4: picking the brain of our DL model. Diagram: 87 → 1
  24. Step 4: picking the brain of our DL model. Diagram: 87 → 256 → 256 → 64 → 64 → 64 → 64 → 10 → 1
  25. Step 5: semantic clustering. Cluster labels: Default, Very Safe, Mixed Group, Safe, Safe, Mixed Group
  26. Hands on with Keras and TensorFlow
  27. Hyper-parameter tuning - based on scikit-learn - 15 classifiers - 14 feature preprocessing methods - 4 data preprocessing methods - 110 hyperparameters - Supervised classification challenge: 100 different datasets https://arxiv.org/abs/1611.03824v1
  28. Inspiration from the Web. The API for banking data. Two levels: - Transactions - Risk Scoring
  29. Card Theft: Geo-Alerting
  30. Predictive APIs: Clustering Geolocated Data. "Clustering geolocated data using Spark and DBSCAN: How to group users’ events using machine learning and distributed computing", by Natalino Busa
  31. @natbusa | linkedin.com: Natalino Busa Venues and Events
  32. Events clustering
  33. Card Theft/Cloning: DBSCAN and Convex Hulls
  34. Kafka+Cassandra+Spark: SMACK stack. Streaming Machine Learning. Cassandra: Fast writes, 2D Data Structure, Replicated, Tunable consistency, Multi-Data centers. Kafka: Streaming Events; Distributed, Scalable Transport; Events are persisted; Decoupled Consumer-Producers; Topics and Partitions. Spark: Ad-Hoc Queries; Joins, Aggregate; User Defined Functions; Machine Learning, Advanced Stats and Analytics
  35. Spark: Unified Distributed Computing: SQL + Machine Learning + Graph Analytics. Spark - RDDs: Streaming, SQL, MLlib, GraphX. Analytics, Statistics, Data Science, Model Training. Data Sources: HDFS, NoSQL, SQL. Other diagram labels: Map-Reduce, HDFS, Kafka, Hive
  36. Spark and Cassandra: distributed goodness. Cassandra: Store all the data (Storage!). Spark: Analyze all the data (Analytics!). DC1: replication factor 3. DC2: replication factor 3. DC3: replication factor 3 + Spark Executors
  37. Cassandra - Spark Connector. Cassandra: Store all the data. Spark: Distributed Data Processing, Executors and Workers. Cassandra-Spark Connector: Data locality, Reduce Shuffling, RDDs to Cassandra Partitions. DC3: replication factor 3 + Spark Executors
  38. Cyber security in Finance
  39. Network Intrusion Detection. The dataset contains 130 million flow records involving 12,027 distinct computers over 36 days (not the full 58 days claimed for the entire data release). Each record consists of: time (to nearest second), duration, source and destination computer ids, source and destination ports, protocol, number of packets and number of bytes. Techniques: TDA, Dimensionality Reduction https://en.wikipedia.org/wiki/Nonlinear_dimensionality_reduction
  40. AI: tools and technologies
  41. Tools for AI and Machine (deep) Learning … these are just a few examples ...
  42. AI: models and algorithms
  43. AI: an ensemble of analytical methods: SQL + Graph + Text + Machine Learning + Voice/Image/Video
  44. AI in Finance: Recap & Lessons Learned
  45. Takeaways ● AI can be applied in Finance: YES ● Train your AI: Domain Experts + ML ● Use All Tools, All Data
  46. Distributed computing, Artificial Intelligence, Machine Learning, Statistics, Big/Fast Data, Streaming Computing. LinkedIn and Twitter: @natbusa
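
Note on slide 19 (feature engineering): the slide lists summary features next to the six repayment-status columns. The two simplest ones, pay_avg and pay_std, are consistent with the mean and the sample standard deviation of pay_1 to pay_6. The sketch below reproduces them for the values shown on that slide. The helper name and the dict layout are illustrative assumptions, and the pay_avgamt* features are left out because the slide does not say how they are computed.

```python
from statistics import mean, stdev

PAY_COLUMNS = ["pay_1", "pay_2", "pay_3", "pay_4", "pay_5", "pay_6"]


def add_pay_summary(row):
    """Return a copy of `row` with pay_avg and pay_std added.

    Hypothetical helper: assumes pay_avg is the mean and pay_std the
    sample standard deviation (n - 1 denominator) of pay_1..pay_6.
    """
    values = [row[name] for name in PAY_COLUMNS]
    enriched = dict(row)  # leave the caller's row untouched
    enriched["pay_avg"] = mean(values)
    enriched["pay_std"] = stdev(values)
    return enriched


# Repayment-status values shown on slide 19.
slide_row = {"pay_1": -1, "pay_2": 0, "pay_3": -1,
             "pay_4": 0, "pay_5": 0, "pay_6": 0}
features = add_pay_summary(slide_row)
print(round(features["pay_avg"], 6), round(features["pay_std"], 6))
```

Rounded to six decimals, the two values agree with the pay_avg and pay_std entries on the slide.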
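
Note on slides 20 to 22 (baseline vs. deep model): the slides show how each model is stacked but not how it is compiled. The following is a minimal sketch of both models built side by side, assuming the Keras bundled with TensorFlow is installed. The builder names, the sigmoid output on the baseline (slide 20 uses relu, slide 22 calls the model a shallow logit), the adam optimizer and the AUC metric are assumptions for illustration, not the setup used in the talk.

```python
from tensorflow import keras

layers = keras.layers
INPUT_DIM = 87  # input width shown in the slide diagrams


def build_baseline(input_dim=INPUT_DIM):
    """Single sigmoid unit: logistic regression written as a Keras model."""
    return keras.Sequential([
        keras.Input(shape=(input_dim,)),
        layers.Dense(1, activation="sigmoid"),
    ])


def build_deep(input_dim=INPUT_DIM):
    """Layer stack from slide 21: 256-256-64-64-64-64-10-1 with dropout."""
    model = keras.Sequential([keras.Input(shape=(input_dim,))])
    for units in (256, 256):
        model.add(layers.Dense(units, activation="relu"))
    model.add(layers.Dropout(0.25))
    for _ in range(2):
        model.add(layers.Dense(64, activation="relu"))
        model.add(layers.Dense(64, activation="relu"))
        model.add(layers.Dropout(0.25))
    model.add(layers.Dense(10, activation="sigmoid"))
    model.add(layers.Dense(1, activation="sigmoid"))  # probability of default
    return model


def compile_for_default_prediction(model):
    # Assumed training setup; the slides do not show the compile step.
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=[keras.metrics.AUC(name="auc")])
    return model


baseline = compile_for_default_prediction(build_baseline())
deep = compile_for_default_prediction(build_deep())
print(baseline.count_params(), deep.count_params())
```

Training both with the same `fit` arguments and comparing the validation AUC would be one way to answer the question on slide 22; the parameter counts alone already show how much larger the deep model is.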
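
Note on slides 30 to 33 (geo-alerting): the slides describe grouping a card holder's geolocated events with DBSCAN and wrapping each cluster in a convex hull. The sketch below shows the idea on a single machine with scikit-learn and SciPy rather than Spark. The coordinates, eps_km, min_samples and function names are made up for illustration, and the hull is computed on raw latitude/longitude pairs, which is only a reasonable approximation for small areas.

```python
import numpy as np
from scipy.spatial import ConvexHull
from sklearn.cluster import DBSCAN

EARTH_RADIUS_KM = 6371.0088


def cluster_events(latlon_deg, eps_km=1.0, min_samples=3):
    """Label each (lat, lon) event with a DBSCAN cluster id (-1 = noise)."""
    model = DBSCAN(eps=eps_km / EARTH_RADIUS_KM, min_samples=min_samples,
                   metric="haversine", algorithm="ball_tree")
    # The haversine metric expects [lat, lon] pairs in radians.
    return model.fit_predict(np.radians(latlon_deg))


def cluster_hulls(latlon_deg, labels):
    """Map cluster id -> hull vertices as (lat, lon), skipping noise."""
    hulls = {}
    for label in sorted(set(labels.tolist()) - {-1}):
        points = latlon_deg[labels == label]
        if len(points) >= 3:  # a 2D hull needs at least three points
            hulls[label] = points[ConvexHull(points).vertices]
    return hulls


# Made-up events: two tight groups plus one isolated point.
events = np.array([
    [52.370, 4.895], [52.372, 4.899], [52.368, 4.893], [52.374, 4.890],
    [51.920, 4.480], [51.923, 4.477], [51.918, 4.483], [51.925, 4.485],
    [48.857, 2.352],
])
labels = cluster_events(events)
hulls = cluster_hulls(events, labels)
print(labels.tolist(), sorted(hulls))
```

A new transaction that falls outside every hull, like the isolated point here, is the kind of event a geo-alert for card theft or cloning could flag for review.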
