Presentation made at the 1st International Conference on Artificial Intelligence Applications in Environmental, Social, and Governance, organized by the Indian Institute of Management, Bangalore.
Abstract: Many environmental remediation and energy applications (conversion and storage) that support sustainability depend on the design and development of novel green materials. Discovering such materials is slow and cumbersome because of the large number of possible combinations and permutations of material structures. Theoretical studies based on Density Functional Theory (DFT) and other theories, coupled with simulations, are often conducted to narrow the sample space of candidate materials before laboratory synthesis and analysis. With the emergence of artificial intelligence (AI), AI techniques are also being tried in this process to reduce simulation time and cost. However, extracting the considerable value of previously published research from around the world still relies on labor-intensive manual effort and the discretion of individual researchers, and is prone to human omissions. AIMS-EREA is our novel framework that blends established materials science theory with the power of Generative AI to make the discovery of materials for sustainability smoother and faster. It also helps to eliminate the possibility of producing hazardous residues and by-products of the reactions. AIMS-EREA draws on all available resources: predictive and analytical AI applied to large collections of chemical databases, along with automated, intelligent assimilation of deep materials knowledge from previously published research through Generative AI. We demonstrate the framework with an example showing how it can be applied to the development of thermoelectric material for waste heat conversion.
1. AIMS-EREA
A framework for “AI-accelerated Innovation of Materials for Sustainability - for Environmental Remediation and Energy Applications”
An interdisciplinary study and application of Generative AI and Materials Science
2. AIMS-EREA TEAM
Sudarson Roy Pratihar
sudarson@symphonyai.com
Dr Manaswita Nag
drmanaswitanag@gmail.com
Deepesh Pai
deepesh.pai@symphonyai.com
3. Table of contents
Sustainable Material Development & AI
AIMS-EREA Methodology
Results and Discussion
Conclusion
5. Sustainability = “Meeting the needs of the present without compromising the ability of future generations to meet their own needs.” (United Nations)
6. Challenges in Novel Materials Development
Novel materials are key to energy applications and environmental remediation.
Existing materials suffer from low efficiency, toxicity and high cost.
● Conventional discovery of materials takes many years
● Interdisciplinary, complex problem set
● Materials science, thermodynamics, DFT, …
● Huge and growing corpus of papers
● Growing open scientific databases such as OQMD, MGI, MAPI
● Programming needed to process data, APIs and papers
7. Current Growing Silos…
Scientific Structure Knowledge: rich source of theoretical results, APIs and models
LLMs (large language models): large text knowledge; deep reasoning and code generation; open source (LangChain) to leverage LLMs
Human Expertise: science mind and brain
• Scientific knowledge
• Scientific decisioning
• Scientific workflow
• Scientific analysis
9. AIMS-EREA As a Framework with 2 Personas
Diagram labels:
● REa: Reasoning, Intelligence, Workflow
● API, AI, ML: MGI, OPTIMADE, …
● Feed additional knowledge
● Define target and instruct specifics
● Integrate specialized tools for new tech
● Integrate specialized AI/ML models
● Automated web and specialized searches
Example request: Develop novel thermoelectric for waste heat conversion at power plant
Thoughts: zT > 1, T > 900 K, thermodynamically stable, high oxidation resistance, low cost, non-toxic, …
● Result
● Recipe for batch execution
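The target "zT > 1" on this slide refers to the dimensionless thermoelectric figure of merit. The slide does not define it; for reference, the standard textbook definition, together with the power factor (PF) mentioned later in the deck, is:

```latex
zT = \frac{S^{2}\,\sigma}{\kappa}\,T ,
\qquad
\mathrm{PF} = S^{2}\,\sigma
```

Here S is the Seebeck coefficient, σ the electrical conductivity, κ the total thermal conductivity and T the absolute temperature. A high zT at T > 900 K therefore calls for a large power factor combined with low thermal conductivity at the operating temperature.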
10. Implemented Architecture
(a) Unstructured knowledge ingestion: chunk embeddings with metadata, stored in a vector DB
(b) Define custom instructions in natural language: "Instructions to find suitable for thermoelectric …."
(c) AIMS-EREA Core: REa (Reasoning, Workflow; API, AI, ML) over structured knowledge (MAPI, OQMD, AIMS DB)
(d) Typical execution, starting from the request "Using my instruction set, develop novel thermoelectric for …":
● Get instruction set
● Develop workflow based on instruction set and LLM’s own knowledge
● Filter structured knowledge bases with criteria and merge
● Some properties still missing
● Now tap unstructured source to enrich
● AI model to infer zT and PF
● Apply criteria and rank
● Potential novel candidates
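The "filter structured knowledge bases with criteria and merge" step, followed by the check for properties that are "still missing", could be sketched roughly as below. This is only an illustration under stated assumptions: the record layout, the property names, the stability threshold and the placeholder formulas are all hypothetical, and no real database API is called.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[float]]

# Hypothetical records keyed by formula; formulas and values are placeholders,
# not data from MAPI, OQMD or any other real source.
SOURCE_A: Dict[str, Record] = {
    "X2YO6": {"band_gap_ev": 2.0, "e_above_hull_ev": 0.0},
    "XY3": {"band_gap_ev": 0.5, "e_above_hull_ev": 0.2},
}
SOURCE_B: Dict[str, Record] = {
    "X2YO6": {"seebeck_uv_per_k": 150.0},
    "Z2O3": {"seebeck_uv_per_k": None},
}

# Properties the later steps are assumed to need (hypothetical list).
REQUIRED = ("band_gap_ev", "e_above_hull_ev", "seebeck_uv_per_k")


def merge_sources(*sources: Dict[str, Record]) -> Dict[str, Record]:
    """Merge per-formula records, keeping the first non-missing value."""
    merged: Dict[str, Record] = {}
    for source in sources:
        for formula, props in source.items():
            target = merged.setdefault(formula, {})
            for key, value in props.items():
                if target.get(key) is None:
                    target[key] = value
    return merged


def passes_stability(record: Record, max_e_above_hull_ev: float = 0.05) -> bool:
    """Keep records that are stable enough, or whose stability is unknown."""
    value = record.get("e_above_hull_ev")
    return value is None or value <= max_e_above_hull_ev


def missing_properties(record: Record) -> List[str]:
    """List required properties that still have no value."""
    return [name for name in REQUIRED if record.get(name) is None]


merged = merge_sources(SOURCE_A, SOURCE_B)
shortlist = {f: r for f, r in merged.items() if passes_stability(r)}
gaps = {f: missing_properties(r) for f, r in shortlist.items()}
```

Entries with non-empty `gaps` are the ones that would be passed on to the unstructured sources or to an inference model for enrichment.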
11. Major Building Blocks of AIMS-EREA
● Reasoning and Intelligent Agent (RIA): leveraging neuro-symbolic architectures such as ReAct, MRKL, Plan-n-Solve
● Knowledge Ingestion: structured API + unstructured text
● Workflow: leveraging scientist’s intelligence to instruct RIA
● Pluggability of Tools (ToolSets): future proofing and extensibility
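One way to picture "pluggability of tools" is a small registry that the agent calls by name. The sketch below is an assumption-laden illustration, not the AIMS-EREA implementation: `ToolRegistry`, `execute_plan` and the two stub tools are hypothetical names, and the plan is fixed in advance, whereas a ReAct-style agent would let the LLM choose the next tool after each observation.

```python
from typing import Callable, Dict, List, Tuple

Tool = Callable[[str], str]


class ToolRegistry:
    """Hypothetical registry that maps tool names to callables."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def register(self, name: str, tool: Tool) -> None:
        if name in self._tools:
            raise ValueError(f"tool already registered: {name}")
        self._tools[name] = tool

    def run(self, name: str, tool_input: str) -> str:
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](tool_input)


def execute_plan(registry: ToolRegistry,
                 plan: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Run (tool, input) steps in order and collect (tool, observation) pairs."""
    trace = []
    for name, tool_input in plan:
        trace.append((name, registry.run(name, tool_input)))
    return trace


# Stub tools standing in for a structured-database query and a literature search.
def structured_lookup(query: str) -> str:
    return f"structured results for: {query}"


def literature_search(query: str) -> str:
    return f"literature snippets for: {query}"


registry = ToolRegistry()
registry.register("structured_lookup", structured_lookup)
registry.register("literature_search", literature_search)

plan = [
    ("structured_lookup", "oxide thermoelectrics"),
    ("literature_search", "oxide thermoelectrics zT"),
]
trace = execute_plan(registry, plan)
```

Adding a new capability then means registering one more callable, which is the kind of extensibility the slide points to.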
13. Thermo-Electric Material Discovery – As validation
Validation and Observations
● Ability to discover and comprehend the correct instructions
● Right decisioning and selection of tools
● Correctness of each output step (without hallucinations)
● Final output
14.
15. RESULTS OF THERMOELECTRIC MATERIAL DISCOVERY
Material | zT | Temperature (K) | Recommendation
Ca2ZrTiO6 | 4.4 | 500 | Thermodynamic stability to be checked
Sr0.09Ba0.11Yb0.05Co4Sb12 | 1.6 | 800 | Possible toxicity due to Sb
n-type nano-structured SiGe | 1.3 | 700 | Bulk form widely used. Novelty to be studied.
MgTa2O6 | 1.1 | 1200 |
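As a worked illustration of the "apply criteria and rank" step, the target stated earlier in the deck (zT > 1, T > 900 K) can be applied mechanically to the four rows of this results slide. The sketch assumes the temperature column is the temperature at which the zT value applies; the `Candidate` class and `shortlist` function are hypothetical names, and the outcome is not the authors' ranking or recommendation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    material: str
    zt: float
    temperature_k: float


# Rows copied from the results slide above.
CANDIDATES = [
    Candidate("Ca2ZrTiO6", 4.4, 500),
    Candidate("Sr0.09Ba0.11Yb0.05Co4Sb12", 1.6, 800),
    Candidate("n-type nano-structured SiGe", 1.3, 700),
    Candidate("MgTa2O6", 1.1, 1200),
]


def shortlist(candidates: List[Candidate],
              min_zt: float = 1.0,
              min_temperature_k: float = 900.0) -> List[Candidate]:
    """Keep candidates above both thresholds, highest zT first."""
    kept = [c for c in candidates
            if c.zt > min_zt and c.temperature_k > min_temperature_k]
    return sorted(kept, key=lambda c: c.zt, reverse=True)


result = shortlist(CANDIDATES)
```

Under these two numeric thresholds alone, only the MgTa2O6 row remains; the other criteria from the deck (stability, oxidation resistance, cost, toxicity) are not encoded here.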
17. Final thoughts
01 Opening up vast capability by linking the strength of multiple disciplines
02 Has potential to set up guidance for discovery and synthesis
03 Can be extended in various areas of material discovery
18. THANKS!
Do you have any questions? sudarson@symphonyai.com
+91 9632203793
www.linkedin.com/in/sudarson