The document discusses the DATAIA Institute, a convergence institute in France focused on data science, artificial intelligence, and their societal impacts. The institute brings together 130 affiliated researchers, with a target of 300 within three years, and has 14 academic founding institutions. It addresses four overarching challenges: machine learning and AI, data and knowledge, transparency and ethics, and data protection. The institute covers research, training, and partnerships with industry on topics such as responsible and transparent AI, algorithmic bias, and data privacy.
DATAIA & TransAlgo
1. Data Science, Intelligence & Society
March 2018
DATAIA Institute
Data Science, Artificial Intelligence & Society
Nozha Boujemaa
Director at DATAIA Institute
Research Director at Inria
nozha.boujemaa@inria.fr
2. Data Science, Intelligence & Society
Aim of convergence Institutes
• Structuring a few centres that gather large-scale, highly visible multidisciplinary
scientific task forces in order to address major challenges at the crossroads of
societal and economic issues and questions from the scientific community.
• Advanced research-training integration.
• Effective coupling with the socio-economic world: industry partnership.
• DATAIA is the Convergence Institute in Data Science, AI & Society, gathering 130
affiliated researchers and targeting 300 within 3 years. Kick-off: 15 February 2018
3. Data Science, Intelligence & Society
DATAIA Institute
• 4 Overarching Challenges:
o From Machine Learning to Artificial Intelligence,
o From Data to Knowledge, from Data to Decision,
o Transparency, Responsible AI & Ethics,
o Data Protection, Regulation and Economy
• Scientific and disciplinary foundations: Math, Computer Sciences, Management and Economy,
Social Sciences, Legal Sciences
• Application domains: Internet of people and things, Urbanization 4.0 & Mobility, Optimal Energy
Management, Business Analytics, Health, Well being & personal nutrition, e-Sciences.
• Roadmap for 8 years, 10M€ -180 M€ Global Budget, with 14 academic founding institutions
• Kick-off => February 15th 2018
4. Data Science, Intelligence & Society
Founding members
• The DATAIA Institute is hosted by Université Paris-Saclay and led by the
Inria Saclay – Île-de-France research centre:
• The consortium brings together universities, national research institutes
and Grandes Écoles:
5. Data Science, Intelligence & Society
Industrial Affiliation Program
• Contributions: research support, data and use cases
• Participation in the definition, selection and monitoring of programs
• Participation in defining the long-term strategic vision
• Workshops, S&T work exchange sessions, brainstorming sessions (open problems), etc
• IP will follow the rules defined in a consortium agreement
• First look at IP.
Based on what is done at American universities (Stanford model)
6. Data Science, Intelligence & Society
International partners
• Alan Turing Institute (UK)
• IVADO (Canada)
• Advanced Core Technologies for Big Data Integration (Japan)
• DSI (Data Science Institute – Columbia University)
7. Data Science, Intelligence & Society
Data & Algorithms
« 2 sides of the same coin »
• Rising benefits from Big Data and AI technologies have a wide impact on our economy and
social organization;
• Transparency and trust of such Algorithmic Systems (data & algorithms) are becoming
competitiveness factors for the data-driven economy;
• Data analytics is shifting from description of the past to predictive and prescriptive analytics for
decision support;
• Importance of remedying the information asymmetry between the producer of the digital
service and its consumer, be it citizen or professional (B2C or B2B) => civil rights, competition,
sovereignty.
8. Data Science, Intelligence & Society
Algorithmic systems in everyday life
• Some dominant platforms on the market play the role of “prescriber”
by directing a large share of user traffic:
• Ranking mechanisms (search engines),
• Recommendation mechanisms and content selection
• Product or service recommendation: is it the most appropriate for the consumer
(personalization) or the most appropriate for the seller (given the stock)?
• Opacity of the use made of personal data and how they are processed,
• What about consent? Is it always respected? Mobilitics CNIL-Inria (Privatics)
• Credit scoring, how fair is it?
• Predictive justice?
⇒ New discrimination between those who know how algorithms work and those who do not
In addition to economic and geostrategic effects on persons and societies
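The question "Credit scoring, how fair is it?" can be made concrete with a simple measurement. The sketch below is an illustration only and is not taken from the slides: it assumes binary approve/reject decisions and a single group attribute, and it uses demographic parity (comparing approval rates across groups), which is only one of several possible fairness criteria. The function names and the toy data are hypothetical.

```python
def approval_rates(decisions, groups):
    """Return the share of approved decisions for each group.

    decisions: iterable of 0/1 (or False/True) outcomes, 1 meaning approved.
    groups: iterable of group labels, aligned with decisions.
    """
    totals, approved = {}, {}
    for decision, group in zip(decisions, groups):
        totals[group] = totals.get(group, 0) + 1
        approved[group] = approved.get(group, 0) + (1 if decision else 0)
    return {group: approved[group] / totals[group] for group in totals}


def demographic_parity_gap(decisions, groups):
    """Difference between the highest and lowest group approval rates.

    A gap of 0 means every group is approved at the same rate; a larger
    gap signals a disparity that deserves a closer look.
    """
    rates = approval_rates(decisions, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: 4 applicants in group "A", 4 in group "B".
toy_decisions = [1, 1, 1, 0, 1, 0, 0, 0]
toy_groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
toy_gap = demographic_parity_gap(toy_decisions, toy_groups)
```

A non-zero gap does not prove discrimination by itself, since the groups may differ in legitimate ways. It is a starting point for an audit rather than a verdict.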
9. Data Science, Intelligence & Society
Algorithmic Systems Bias
Mastering Big Data Technologies: Bias problems could impact the accuracy of data
technologies and people’s lives
Challenge 1: Data Inputs to an Algorithm
o Poorly selected data
o Incomplete, incorrect, or outdated data
o Data sets that lack information about, or disproportionately represent, certain populations
o Malicious attacks
Challenges 2: The Design of Algorithmic Systems and Machine Learning
o Poorly designed matching systems
o Unintentional perpetuation and promotion of historical biases
o Decision-making systems that assume correlation implies causation
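As a hedged illustration of the data-input challenge listed above (data sets that disproportionately represent certain populations), the minimal sketch below compares each group's share in a sample with its share in a reference population. The function name, group labels and numbers are hypothetical and are not taken from the DATAIA material.

```python
from collections import Counter


def representation_ratios(sample_groups, reference_shares):
    """Return, per group, the sample share divided by the reference share.

    A ratio close to 1.0 means the group is represented proportionally;
    a ratio well below 1.0 suggests under-representation, well above 1.0
    over-representation.
    """
    total = len(sample_groups)
    if total == 0:
        raise ValueError("sample_groups must not be empty")
    counts = Counter(sample_groups)
    ratios = {}
    for group, ref_share in reference_shares.items():
        sample_share = counts.get(group, 0) / total
        ratios[group] = sample_share / ref_share
    return ratios


# Hypothetical data: group "B" is half the population but 20% of the sample.
sample = ["A"] * 80 + ["B"] * 20
reference = {"A": 0.5, "B": 0.5}
ratios = representation_ratios(sample, reference)
```

Such a check only flags a possible sampling imbalance; it says nothing about why the imbalance exists or whether it affects a downstream model.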
10. Data Science, Intelligence & Society
Challenges / Efforts
• It is a mistake to assume algorithms are objective simply because they are data-driven
• Algorithms are encapsulated opinions, expressed through decision parameters and learning data
• Mastering the accuracy and robustness of Big Data & AI techniques: bias, reproducibility,
sources of unintentional discrimination
• Implementing “transparent-by-design”: fairness/equity, loyalty, neutrality, etc.
• Interdisciplinary co-design of solutions: how responsible is an ML algorithm?
• Interdisciplinary training of data scientists: law, sociology and economics; careful software reuse
=> mastering information leaks (SRE)
AI is part of the solution and not only the law!
Transparency Tools vs GDPR vs Having the Choice
11. Data Science, Intelligence & Society
Challenges / Efforts
Transparent-by-design, auditable-by-design, fairness & non-discrimination-by-design
• Explainability, reproducibility & robustness of ML
• Data provenance and usage monitoring
• Progressive user-centric analytics (mix of dataviz and analytics)
• New paradigms for information flow monitoring
• Fact-checking requiring explicit & verifiable integration of heterogeneous data sources
12. Data Science, Intelligence & Society
Challenges / Efforts
• Complex concepts that depend on cultural context, legal context, etc.
International collaboration is key
Transparency, Asymmetry, Accountability, Loyalty, Fairness, Equity, Intelligibility, Explainability, Traceability,
Auditability, Proof and Certification, Performance, Ethics, Responsibility
Ethical ≠ Responsible, Transparent ≠ making the source code available
• Pedagogy and explanation, awareness, use cases (for all audiences, including scientists)
• Auditability and building transparent-by-design tools and algorithms
ML algorithms are shared as open source but NOT the data (governance of AS!)
13. Data Science, Intelligence & Society
Interdisciplinary challenges
• From Machine Learning to Artificial Intelligence
o Innovative machine learning and AI: common sense, adaptability, generalization
o Deep learning and adversarial learning
o Machine learning and hyper-optimization
o Optimization for learning, stochastic gradient method improvements, Bayesian
optimization, combinatorial optimization
o Link between learning and modelling, integration of a priori into learning
o Repeatability and robust learning
o Statistical Inference and Validation
o Composition of deep architectures
14. Data Science, Intelligence & Society
Interdisciplinary challenges
• From Data to Knowledge, from Data to Decision
o Heterogeneous, semi-structured, complex, incomplete and/or uncertain data
o Fast big data: new methodologies to use data
o Online learning, methodology for massive data, efficient methods
o Improved storage, calculation and estimation for data science
o Modeling of interactions between agents (human or artificial) by game theory
o Multiscale and multimodal representation and algorithms
o Theoretical analysis of heuristic methods (complexity theory, information geometry, Markov
chain theory)
o Human-machine co-evolution in autonomous systems: conversational agents, autonomous
systems, social robots
15. Data Science, Intelligence & Society
Interdisciplinary challenges
• Transparency & digital trust
o Responsibility-by-design, Explicability-by-design
o Transparency-by-design, equity-by-design
o Audit of algorithmic systems: non-discrimination, loyalty, technical bias, neutrality, fairness
o Measuring digital trust and ownership
o Progressive user-centric analytics (interactive monitoring of decision systems: dataviz,
dashboards, HMI)
o Responsibility for information processing and decision-making: data usage control and fact-
checking
o Causal discovery, traceability of inferences from source data, interpretability of deep
architectures
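One possible way to make the "audit of algorithmic systems: non-discrimination" item concrete is to compare the rate of positive decisions across groups (often called demographic parity). The sketch below is only an assumption about how such an audit could start: the function names and data are hypothetical, and demographic parity is one fairness criterion among several, not necessarily the one intended here.

```python
def selection_rates(decisions, groups):
    """Return the positive-decision rate per group (decisions are 0 or 1)."""
    totals, positives = {}, {}
    for decision, group in zip(decisions, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + decision
    return {g: positives[g] / totals[g] for g in totals}


def disparate_impact_ratio(rates):
    """Lowest group rate divided by highest group rate (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody receives a positive decision in any group
    return min(rates.values()) / highest


# Hypothetical decisions of a black-box system for two groups of four people.
decisions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["A"] * 4 + ["B"] * 4
rates = selection_rates(decisions, groups)
ratio = disparate_impact_ratio(rates)
```

A ratio far below 1.0 would be a signal to investigate further, not proof of discrimination: the audit needs only the system's inputs and outputs, which is why it can be run without access to source code.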
16. Data Science, Intelligence & Society
Interdisciplinary challenges
• Data protection, regulation and economy
o "Privacy-by-design", GDPR
o Distributed Machine Learning preserving privacy
o Development of ethically responsible methodologies and technologies to
regulate the collection, use and processing of personal data, and the
exploitation of the knowledge derived from this data.
o Computer security of data processing chains
o Security/crypto: block-chain and trusted third parties
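To illustrate what "distributed machine learning preserving privacy" can look like, here is a minimal sketch of federated averaging, chosen as an example technique (the slides do not name a specific method). Each site trains on its own data and shares only a model weight, never the raw records. The toy model (y ≈ w * x), the data and all names are hypothetical; sharing parameters alone is not a formal privacy guarantee.

```python
def local_update(w, data, lr=0.01, steps=50):
    """Run a few gradient-descent steps for y ≈ w * x on one site's private data."""
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w_global, sites, lr=0.01, steps=50):
    """Average the locally updated weights, weighted by each site's sample count."""
    total = sum(len(data) for data in sites)
    weighted = sum(local_update(w_global, data, lr, steps) * len(data) for data in sites)
    return weighted / total


# Hypothetical private data sets held by two sites; both follow y = 2 * x.
sites = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0), (4.0, 8.0), (5.0, 10.0)],
]
w = 0.0
for _ in range(20):
    w = federated_round(w, sites)  # only weights cross site boundaries
```

In practice such schemes are usually combined with additional protections (for example secure aggregation or noise addition), since model updates can still leak information about the training data.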
17. Data Science, Intelligence & Society
Training and research
• Three doctoral schools of Université Paris-Saclay: EDMH, ED STIC & ED SHS.
• Reinforce the mathematics/computer science crossover in data science training, with new
interdisciplinary curricula more open to the social sciences and humanities (SHS): awareness
of the responsibility of algorithmic systems, economic models, rights and uses of data.
• Research projects: 3 years, 2 thesis scholarships (or 1 PhD + 1 postdoc/engineer).
• International student mobility (incoming and outgoing) with 2 thesis scholarships
(excellence scholarships) per year.
• Thematic semesters for MSc / PhD / faculty, Biennial Conference, Annual Self-Assessment
Symposium, Workshops, Challenges, Junior Conference, Summer School.
18. Data Science, Intelligence & Society
Co-working
• Workspaces are available for teams affiliated with the DATAIA Institute in the Alan Turing
building, an emblematic venue:
o 1800 sqm of which approximately 300 sqm for the new teams
o 8 teams on site
o 800 sqm of meeting spaces
• Implementation of telepresence screens in progress.
19. Data Science, Intelligence & Society
• National Scientific Platform for Transparency &
Accountability Tools and Methods for Data and
Algorithms (Fairness, Neutrality, Loyalty); B2B &
B2C.
• Support for the new “Law for a Digital Republic”: the
right to an explanation of algorithmic decisions made by
public services (APB service stopped!)
• Contributors: CNNum, DGCCRF, alongside academia
(Grenoble, Paris, Lille, Rennes, etc.), industry and
associations.
20. Data Science, Intelligence & Society
Objectives:
o Resource center, empowerment tools: reports, publications,
software, controlled data sets & testing protocols;
o Awareness raising: workshops & MOOCs;
o Best practices recommendation & sharing;
o Research & development programs.
Working Groups:
o Auditability of Recommendation and Ranking systems;
o Explainability, Reproducibility and Bias of ML;
o Privacy, Data Usage Control & Information-flow monitoring;
o Influence, Nudging, Fact-checking.
21. Data Science, Intelligence & Society
Need for interdisciplinary efforts
Thank you for your attention
nozha.boujemaa@inria.fr
22. Data Science, Intelligence & Society
Summer School
• The DATAIA Institute co-organizes the DS3
Summer School with École polytechnique:
o Confirmed speakers: Cédric Villani, Yann
LeCun, Adrian Weller, Krishna Gummadi,
Jean-Philippe Vert …
o Format: plenary and parallel sessions on
several sites
o Attendees: between 400 and 500
participants (students, academics and
professionals)
Data Science Summer School, 25-29 June 2018, at the École polytechnique campus
• Opening by Cédric Villani
• Tutorials:
o Deep Learning: Yann LeCun (Facebook, New York University)
o Interpretable Machine Learning: Adrian Weller (University of Cambridge, Alan Turing Institute)
o Fairness in Machine Learning: Krishna Gummadi (Max Planck Institute)
o Probabilistic Numerical Methods: Mark Girolami (Imperial College London)
o Online Learning Algorithms: Nicolò Cesa-Bianchi (University of Milan)
o Non-convex Optimization: Suvrit Sra (MIT)
o ... other speakers will be confirmed soon
• Parallel sessions on Health and Social Sciences
• Practical sessions on Deep Learning, Reinforcement Learning, Recommender Systems, Precision Medicine...
• Poster session
• Round table discussion
• Intended for students, academics and professionals
• More information to come on: www.ds3-datascience-polytechnique.fr
23. Data Science, Intelligence & Society
France-Japan Symposium
• The DATAIA Institute co-organizes with JST a France-Japan workshop
on Deep Learning and Artificial Intelligence, in partnership with the
French Embassy in Japan and the Ministry of Higher Education,
Research and Innovation (MESRI)
o Dates: 11-12 July 2018
o Location: Amphitheatre of MESRI
o Format: Plenary sessions
o Attendees: between 150 and 200 participants (academics and
professionals)
o With the winners of the CREST Program (equivalent to ERC
senior) funded by the JST