The document discusses genetic algorithms and their application to inverting the radiative transfer equation (RTE). Genetic algorithms are an optimization technique inspired by biological evolution, involving reproduction with recombination and mutation of "chromosomes" representing potential solutions. For inverting the RTE, atmospheric parameters can be encoded in binary strings called genotypes, and a merit function evaluates the fitness of solutions by comparing synthetic and observed spectra. The genetic algorithm technique explores the search space of possible atmospheres without needing an initial guess, in order to find the model that best reproduces the observed spectra.
This document outlines an agenda and activities for session 3 of a leadership series on current and effective teaching strategies. The session will involve teachers checking in, reporting on strategies they tried, and discussing common themes and questions. They will then work in groups to discuss how they have applied principles of universal design for learning and backwards design in their classrooms. The session aims to help teachers design lesson sequences that support all learners and develop plans to collaborate with others.
The document summarizes information about the Society of Women Engineers (SWE) Houston Section. It discusses SWE's core beliefs of integrity, respect, mutual support, and professional excellence. It also outlines SWE's goals of providing opportunities for women in engineering and embracing diversity. Additionally, it provides details about SWE Region C in Texas, the over 300 members of SWE Houston Section, and their past yearly events and scholarships. Finally, it discusses benefits of SWE membership for both businesses and individuals.
This document provides an overview of advertising, design, and internet services including look and feel design, copywriting, photography, animation, web development, eCommerce solutions, search optimization, content management, and more. It lists relevant technologies and tools and highlights sample client projects in various industries. The document shares positive client feedback and details work in branding, press ads, brochures, 3D imagery, and photography.
This document outlines the phase evaluation process for graduate students in the College of Education. The phases assess mastery of a trademark outcome (TO) through applied assignments and assessments. Phase 1 focuses on knowledge acquisition, Phase 2 on applying skills, and Phase 3 on real-world evaluation and implementation. Students in the Educational Diagnostician program must demonstrate skills in collaborative consultation, assessment planning, and evaluating interventions. Progress is tracked through course assignments, end of phase assessments, and rubric-based evaluations to ensure mastery of the TO before graduation.
May, 2011
Staff spent the first hour in school groups discussing their reading and writing assessment data, then the remainder of the day as a group, focused on Reading Next, AFL and literacy strategies across the grades and curriculum.
This document outlines an agenda and activities for session 3 of a leadership series on current and effective teaching strategies. The session will involve teachers checking in, reporting on strategies they tried, and discussing common themes and questions. They will then work in groups to discuss how they have applied principles of universal design for learning and backwards design in their classrooms. The session aims to help teachers design lesson sequences that support all learners and develop plans to collaborate with others.
The document summarizes information about the Society of Women Engineers (SWE) Houston Section. It discusses SWE's core beliefs of integrity, respect, mutual support, and professional excellence. It also outlines SWE's goals of providing opportunities for women in engineering and embracing diversity. Additionally, it provides details about SWE Region C in Texas, the over 300 members of SWE Houston Section, and their past yearly events and scholarships. Finally, it discusses benefits of SWE membership for both businesses and individuals.
This document provides an overview of advertising, design, and internet services including look and feel design, copywriting, photography, animation, web development, eCommerce solutions, search optimization, content management, and more. It lists relevant technologies and tools and highlights sample client projects in various industries. The document shares positive client feedback and details work in branding, press ads, brochures, 3D imagery, and photography.
This document outlines the phase evaluation process for graduate students in the College of Education. The phases assess mastery of a trademark outcome (TO) through applied assignments and assessments. Phase 1 focuses on knowledge acquisition, Phase 2 on applying skills, and Phase 3 on real-world evaluation and implementation. Students in the Educational Diagnostician program must demonstrate skills in collaborative consultation, assessment planning, and evaluating interventions. Progress is tracked through course assignments, end of phase assessments, and rubric-based evaluations to ensure mastery of the TO before graduation.
May, 2011
Staff spent the first hour in school groups discussing their reading and writing assessment data, then the remainder of the day as a group, focused on Reading Next, AFL and literacy strategies across the grades and curriculum.
This document summarizes a presentation on engaging students given by Faye Brownlie. It discusses various frameworks for engagement, including giving students voice and choice in assignments. Examples are provided of teachers who incorporated more student choice into their lessons, which increased engagement and understanding. Strategies presented include backwards design, formative assessment, and incorporating movement and collaboration into science lessons on electricity and atoms. The overall message is that providing opportunities for student choice and active learning can boost engagement.
Day 1 of a 2 day conversation about leading the learning in literacy. What counts in assessment? How does assessment reflect your values? Formative and summative.
This document discusses nostalgia and the mother through the analysis of various artworks depicting motherhood as well as nostalgic objects. It presents works by artists such as Mary Cassatt, Kitagawa Utamaro, Fra Bartolommeo, Cindy Sherman, and Barbara Kruger that portray the mother visually or through symbolic representation. Nostalgic objects mentioned include souvenirs, heirlooms, and antiques. The document provides a critical analysis of the cultural symbolism and representation of "the mother".
This document summarizes the features of the Ciphone CECT i68+ mobile phone. It is a dual SIM phone with 4GB of memory, FM radio, and support for applications like Java, MSN, office software and PDF files. Additional features include a 3.2 inch LCD screen, 2MP camera, MP3/MP4 music and video playback, Bluetooth, and games that can be downloaded to the memory card. The phone has basic call and messaging capabilities as well as an alarm, calendar, voice recorder and other utilities.
A Leadership Series: Current and Effective Teaching Strategies across the Curriculum.
Day 1 of a leadership series for intermediate and secondary teachers interested in improving practice for all students and in increasing collaboration in schools.
This document discusses considerations for staff development in secondary schools. It outlines elements of effective teacher leadership, including purpose, community, learning and teaching, courage, sustained action, and positive cultures. It also discusses essential components of lessons, such as essential questions, differentiation, assessment for learning, and gradual release of responsibility. The document provides examples of structures that can support staff development, such as learning teams, and reflects on factors that contribute to meaningful professional learning experiences for teachers.
Finding Ground States of Sherrington-Kirkpatrick Spin Glasses with Hierarchic...Martin Pelikan
This study focuses on the problem of finding ground states of random instances of the Sherrington-Kirkpatrick (SK) spin-glass model with Gaussian couplings. While the ground states of SK spin-glass instances can be obtained with branch and bound, the computational complexity of branch and bound yields instances of not more than about 90 spins. We describe several approaches based on the hierarchical Bayesian optimization algorithm (hBOA) to reliably identifying ground states of SK instances intractable with branch and bound, and present a broad range of empirical results on such problem instances. We argue that the proposed methodology holds a big promise for reliably solving large SK spin-glass instances to optimality with practical time complexity. The proposed approaches to identifying global optima reliably can also be applied to other problems and they can be used with many other evolutionary algorithms. Performance of hBOA is compared to that of the genetic algorithm with two common crossover operators.
This document summarizes research on enhancing gene expression programming (GEP) for Reynolds-averaged Navier-Stokes equations turbulence modeling with unsupervised clustering. It presents a GEP-enhanced multi-model framework that uses feature selection, dimensionality reduction, and clustering to assign different turbulence models to distinct regions of a flow, improving simulation accuracy. Results show the approach produces more accurate mean velocities and Reynolds stresses for a body-of-revolution testcase compared to baseline and GEP-driven models. Ongoing work includes optimizing the framework configuration and extending it to 3D domains.
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...Xavier Llorà
A byproduct benefit of using probabilistic model-building genetic algorithms is the creation of cheap and accurate surrogate models. Learning classifier systems---and genetics-based machine learning in general---can greatly benefit from such surrogates which may replace the costly matching procedure of a rule against large data sets. In this paper we investigate the accuracy of such surrogate fitness functions when coupled with the probabilistic models evolved by the x-ary extended compact classifier system (xeCCS). To achieve such a goal, we show the need that the probabilistic models should be able to represent all the accurate basis functions required for creating an accurate surrogate. We also introduce a procedure to transform populations of rules based into dependency structure matrices (DSMs) which allows building accurate models of overlapping building blocks---a necessary condition to accurately estimate the fitness of the evolved rules.
Stochastic Gradient Descent with Exponential Convergence Rates of Expected Cl...Atsushi Nitanda
Stochastic Gradient Descent with Exponential Convergence Rates of Expected Classification Errors
The document presents two main results:
1) Stochastic Gradient Descent (SGD) achieves linear convergence rates for expected classification error under a strong low noise condition. The number of iterations needed for an epsilon solution is O(log(1/epsilon)).
2) Averaged SGD (ASGD) achieves even faster linear convergence rates under the same condition, requiring O(log(1/epsilon)) iterations for an epsilon solution.
The results improve upon prior work by showing faster-than-sublinear convergence rates for more suitable loss functions like logistic loss. Toy experiments demonstrate the theoretical findings.
This document discusses optimization techniques for deep learning models, including stochastic gradient descent and data preprocessing. Stochastic gradient descent trains models faster than traditional gradient descent by using mini-batches of data, and often leads to better generalization. Data should be preprocessed by centering and normalizing inputs to aid optimization. Mini-batches should be shuffled and contain a mix of classes to improve training.
Linkage Learning for Pittsburgh LCS: Making Problems TractableXavier Llorà
Presentation by Xavier Llorà, Kumara Sastry, & David E. Goldberg showing how linkage learning is possible on Pittsburgh style learning classifier systems
The document discusses linear regression and maximum likelihood estimation in EViews. It provides an overview of the linear regression model and ordinary least squares (OLS) estimation. It then discusses how to perform OLS regression and test linear restrictions in EViews. Finally, it introduces maximum likelihood estimation and how it provides an alternative framework for estimation when the data is assumed to follow a particular probability distribution.
- This document discusses how data mining techniques have been successfully applied to various NASA data sources to predict software quality metrics like effort and defects.
- However, despite clear success, most of the data sources used in these examples have been discontinued or are no longer accessible.
- The main challenge is organizational rather than technological - how can we ensure the valuable lessons learned from this work are not wasted?
Slide for Arithmer Seminar given by Dr. Daisuke Sato (Arithmer) at Arithmer inc.
The topic is on "explainable AI".
"Arithmer Seminar" is weekly held, where professionals from within and outside our company give lectures on their respective expertise.
The slides are made by the lecturer from outside our company, and shared here with his/her permission.
Arithmer株式会社は東京大学大学院数理科学研究科発の数学の会社です。私達は現代数学を応用して、様々な分野のソリューションに、新しい高度AIシステムを導入しています。AIをいかに上手に使って仕事を効率化するか、そして人々の役に立つ結果を生み出すのか、それを考えるのが私たちの仕事です。
Arithmer began at the University of Tokyo Graduate School of Mathematical Sciences. Today, our research of modern mathematics and AI systems has the capability of providing solutions when dealing with tough complex issues. At Arithmer we believe it is our job to realize the functions of AI through improving work efficiency and producing more useful results for society.
Compressed Sensing using Generative Modelkenluck2001
This document summarizes Kenneth Emeka Odoh's presentation on using generative models for compressed sensing. It introduces compressed sensing and describes how generative models like variational autoencoders (VAEs) and generative adversarial networks (GANs) can be used to solve the underdetermined linear systems that arise in compressed sensing. The document also compares VAEs to standard autoencoders and discusses how generative models may require fewer training examples due to regularization imposing structure on the problem.
PR-297: Training data-efficient image transformers & distillation through att...Jinwon Lee
안녕하세요 TensorFlow Korea 논문 읽기 모임 PR-12의 297번째 리뷰입니다
어느덧 PR-12 시즌 3의 끝까지 논문 3편밖에 남지 않았네요.
시즌 3가 끝나면 바로 시즌 4의 새 멤버 모집이 시작될 예정입니다. 많은 관심과 지원 부탁드립니다~~
(멤버 모집 공지는 Facebook TensorFlow Korea 그룹에 올라올 예정입니다)
오늘 제가 리뷰한 논문은 Facebook의 Training data-efficient image transformers & distillation through attention 입니다.
Google에서 나왔던 ViT논문 이후에 convolution을 전혀 사용하지 않고 오직 attention만을 이용한 computer vision algorithm에 어느때보다 관심이 높아지고 있는데요
이 논문에서 제안한 DeiT 모델은 ViT와 같은 architecture를 사용하면서 ViT가 ImageNet data만으로는 성능이 잘 안나왔던 것에 비해서
Training 방법 개선과 새로운 Knowledge Distillation 방법을 사용하여 mageNet data 만으로 EfficientNet보다 뛰어난 성능을 보여주는 결과를 얻었습니다.
정말 CNN은 이제 서서히 사라지게 되는 것일까요? Attention이 computer vision도 정복하게 될 것인지....
개인적으로는 당분간은 attention 기반의 CV 논문이 쏟아질 거라고 확신하고, 또 여기에서 놀라운 일들이 일어날 수 있을 거라고 생각하고 있습니다
CNN은 10년간 많은 연구를 통해서 발전해왔지만, transformer는 이제 CV에 적용된 지 얼마 안된 시점이라서 더 기대가 크구요,
attention이 inductive bias가 가장 적은 형태의 모델이기 때문에 더 놀라운 이들을 만들 수 있을거라고 생각합니다
얼마 전에 나온 open AI의 DALL-E도 그 대표적인 예라고 할 수 있을 것 같습니다. Transformer의 또하나의 transformation이 궁금하신 분들은 아래 영상을 참고해주세요
영상링크: https://youtu.be/DjEvzeiWBTo
논문링크: https://arxiv.org/abs/2012.12877
Using HOG Descriptors on Superpixels for Human Detection of UAV ImageryWai Nwe Tun
This document presents a system for human detection in UAV imagery using HOG descriptors calculated for superpixels extracted via SLIC clustering. The system extracts superpixels, calculates HOG descriptors for each, and classifies the descriptors using an AdaBoost classifier. Evaluation on two datasets shows the superpixel-based approach achieves 3% higher accuracy than the original HOG method while requiring more computational time for training.
Вычислительный эксперимент в молекулярной биофизике белков и биомембранIlya Klabukov
Вычислительный эксперимент в молекулярной биофизике белков и биомембран
Ефремов Роман Гербертович, доктор физико-математических наук, профессор, заместитель директора по науке Института биоорганической химии имени академиков М.М.Шемякина и Ю.А.Овчинникова РАН, руководитель Лаборатории моделирования биомолекулярных систем
synbio2012.ru
1) The document proposes a cardinality-constrained k-means clustering approach to address practical challenges with standard k-means, such as skewed clustering and sensitivity to outliers.
2) It formulates the problem as a mixed integer nonlinear program (MINLP) and provides a convex relaxation to the problem using semidefinite programming (SDP).
3) The approach provides optimality guarantees and a rounding algorithm to recover an integer feasible solution. Numerical experiments demonstrate competitive performance versus heuristics.
This document summarizes a presentation on engaging students given by Faye Brownlie. It discusses various frameworks for engagement, including giving students voice and choice in assignments. Examples are provided of teachers who incorporated more student choice into their lessons, which increased engagement and understanding. Strategies presented include backwards design, formative assessment, and incorporating movement and collaboration into science lessons on electricity and atoms. The overall message is that providing opportunities for student choice and active learning can boost engagement.
Day 1 of a 2 day conversation about leading the learning in literacy. What counts in assessment? How does assessment reflect your values? Formative and summative.
This document discusses nostalgia and the mother through the analysis of various artworks depicting motherhood as well as nostalgic objects. It presents works by artists such as Mary Cassatt, Kitagawa Utamaro, Fra Bartolommeo, Cindy Sherman, and Barbara Kruger that portray the mother visually or through symbolic representation. Nostalgic objects mentioned include souvenirs, heirlooms, and antiques. The document provides a critical analysis of the cultural symbolism and representation of "the mother".
This document summarizes the features of the Ciphone CECT i68+ mobile phone. It is a dual SIM phone with 4GB of memory, FM radio, and support for applications like Java, MSN, office software and PDF files. Additional features include a 3.2 inch LCD screen, 2MP camera, MP3/MP4 music and video playback, Bluetooth, and games that can be downloaded to the memory card. The phone has basic call and messaging capabilities as well as an alarm, calendar, voice recorder and other utilities.
A Leadership Series: Current and Effective Teaching Strategies across the Curriculum.
Day 1 of a leadership series for intermediate and secondary teachers interested in improving practice for all students and in increasing collaboration in schools.
This document discusses considerations for staff development in secondary schools. It outlines elements of effective teacher leadership, including purpose, community, learning and teaching, courage, sustained action, and positive cultures. It also discusses essential components of lessons, such as essential questions, differentiation, assessment for learning, and gradual release of responsibility. The document provides examples of structures that can support staff development, such as learning teams, and reflects on factors that contribute to meaningful professional learning experiences for teachers.
Finding Ground States of Sherrington-Kirkpatrick Spin Glasses with Hierarchic...Martin Pelikan
This study focuses on the problem of finding ground states of random instances of the Sherrington-Kirkpatrick (SK) spin-glass model with Gaussian couplings. While the ground states of SK spin-glass instances can be obtained with branch and bound, the computational complexity of branch and bound yields instances of not more than about 90 spins. We describe several approaches based on the hierarchical Bayesian optimization algorithm (hBOA) to reliably identifying ground states of SK instances intractable with branch and bound, and present a broad range of empirical results on such problem instances. We argue that the proposed methodology holds a big promise for reliably solving large SK spin-glass instances to optimality with practical time complexity. The proposed approaches to identifying global optima reliably can also be applied to other problems and they can be used with many other evolutionary algorithms. Performance of hBOA is compared to that of the genetic algorithm with two common crossover operators.
This document summarizes research on enhancing gene expression programming (GEP) for Reynolds-averaged Navier-Stokes equations turbulence modeling with unsupervised clustering. It presents a GEP-enhanced multi-model framework that uses feature selection, dimensionality reduction, and clustering to assign different turbulence models to distinct regions of a flow, improving simulation accuracy. Results show the approach produces more accurate mean velocities and Reynolds stresses for a body-of-revolution testcase compared to baseline and GEP-driven models. Ongoing work includes optimizing the framework configuration and extending it to 3D domains.
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...Xavier Llorà
A byproduct benefit of using probabilistic model-building genetic algorithms is the creation of cheap and accurate surrogate models. Learning classifier systems---and genetics-based machine learning in general---can greatly benefit from such surrogates which may replace the costly matching procedure of a rule against large data sets. In this paper we investigate the accuracy of such surrogate fitness functions when coupled with the probabilistic models evolved by the x-ary extended compact classifier system (xeCCS). To achieve such a goal, we show the need that the probabilistic models should be able to represent all the accurate basis functions required for creating an accurate surrogate. We also introduce a procedure to transform populations of rules based into dependency structure matrices (DSMs) which allows building accurate models of overlapping building blocks---a necessary condition to accurately estimate the fitness of the evolved rules.
Stochastic Gradient Descent with Exponential Convergence Rates of Expected Cl...Atsushi Nitanda
Stochastic Gradient Descent with Exponential Convergence Rates of Expected Classification Errors
The document presents two main results:
1) Stochastic Gradient Descent (SGD) achieves linear convergence rates for expected classification error under a strong low noise condition. The number of iterations needed for an epsilon solution is O(log(1/epsilon)).
2) Averaged SGD (ASGD) achieves even faster linear convergence rates under the same condition, requiring O(log(1/epsilon)) iterations for an epsilon solution.
The results improve upon prior work by showing faster-than-sublinear convergence rates for more suitable loss functions like logistic loss. Toy experiments demonstrate the theoretical findings.
This document discusses optimization techniques for deep learning models, including stochastic gradient descent and data preprocessing. Stochastic gradient descent trains models faster than traditional gradient descent by using mini-batches of data, and often leads to better generalization. Data should be preprocessed by centering and normalizing inputs to aid optimization. Mini-batches should be shuffled and contain a mix of classes to improve training.
Linkage Learning for Pittsburgh LCS: Making Problems TractableXavier Llorà
Presentation by Xavier Llorà, Kumara Sastry, & David E. Goldberg showing how linkage learning is possible on Pittsburgh style learning classifier systems
The document discusses linear regression and maximum likelihood estimation in EViews. It provides an overview of the linear regression model and ordinary least squares (OLS) estimation. It then discusses how to perform OLS regression and test linear restrictions in EViews. Finally, it introduces maximum likelihood estimation and how it provides an alternative framework for estimation when the data is assumed to follow a particular probability distribution.
- This document discusses how data mining techniques have been successfully applied to various NASA data sources to predict software quality metrics like effort and defects.
- However, despite clear success, most of the data sources used in these examples have been discontinued or are no longer accessible.
- The main challenge is organizational rather than technological - how can we ensure the valuable lessons learned from this work are not wasted?
Slide for Arithmer Seminar given by Dr. Daisuke Sato (Arithmer) at Arithmer inc.
The topic is on "explainable AI".
"Arithmer Seminar" is weekly held, where professionals from within and outside our company give lectures on their respective expertise.
The slides are made by the lecturer from outside our company, and shared here with his/her permission.
Arithmer株式会社は東京大学大学院数理科学研究科発の数学の会社です。私達は現代数学を応用して、様々な分野のソリューションに、新しい高度AIシステムを導入しています。AIをいかに上手に使って仕事を効率化するか、そして人々の役に立つ結果を生み出すのか、それを考えるのが私たちの仕事です。
Arithmer began at the University of Tokyo Graduate School of Mathematical Sciences. Today, our research of modern mathematics and AI systems has the capability of providing solutions when dealing with tough complex issues. At Arithmer we believe it is our job to realize the functions of AI through improving work efficiency and producing more useful results for society.
Compressed Sensing using Generative Modelkenluck2001
This document summarizes Kenneth Emeka Odoh's presentation on using generative models for compressed sensing. It introduces compressed sensing and describes how generative models like variational autoencoders (VAEs) and generative adversarial networks (GANs) can be used to solve the underdetermined linear systems that arise in compressed sensing. The document also compares VAEs to standard autoencoders and discusses how generative models may require fewer training examples due to regularization imposing structure on the problem.
PR-297: Training data-efficient image transformers & distillation through att...Jinwon Lee
안녕하세요 TensorFlow Korea 논문 읽기 모임 PR-12의 297번째 리뷰입니다
어느덧 PR-12 시즌 3의 끝까지 논문 3편밖에 남지 않았네요.
시즌 3가 끝나면 바로 시즌 4의 새 멤버 모집이 시작될 예정입니다. 많은 관심과 지원 부탁드립니다~~
(멤버 모집 공지는 Facebook TensorFlow Korea 그룹에 올라올 예정입니다)
오늘 제가 리뷰한 논문은 Facebook의 Training data-efficient image transformers & distillation through attention 입니다.
Google에서 나왔던 ViT논문 이후에 convolution을 전혀 사용하지 않고 오직 attention만을 이용한 computer vision algorithm에 어느때보다 관심이 높아지고 있는데요
이 논문에서 제안한 DeiT 모델은 ViT와 같은 architecture를 사용하면서 ViT가 ImageNet data만으로는 성능이 잘 안나왔던 것에 비해서
Training 방법 개선과 새로운 Knowledge Distillation 방법을 사용하여 mageNet data 만으로 EfficientNet보다 뛰어난 성능을 보여주는 결과를 얻었습니다.
정말 CNN은 이제 서서히 사라지게 되는 것일까요? Attention이 computer vision도 정복하게 될 것인지....
개인적으로는 당분간은 attention 기반의 CV 논문이 쏟아질 거라고 확신하고, 또 여기에서 놀라운 일들이 일어날 수 있을 거라고 생각하고 있습니다
CNN은 10년간 많은 연구를 통해서 발전해왔지만, transformer는 이제 CV에 적용된 지 얼마 안된 시점이라서 더 기대가 크구요,
attention이 inductive bias가 가장 적은 형태의 모델이기 때문에 더 놀라운 이들을 만들 수 있을거라고 생각합니다
얼마 전에 나온 open AI의 DALL-E도 그 대표적인 예라고 할 수 있을 것 같습니다. Transformer의 또하나의 transformation이 궁금하신 분들은 아래 영상을 참고해주세요
영상링크: https://youtu.be/DjEvzeiWBTo
논문링크: https://arxiv.org/abs/2012.12877
Using HOG Descriptors on Superpixels for Human Detection of UAV ImageryWai Nwe Tun
This document presents a system for human detection in UAV imagery using HOG descriptors calculated for superpixels extracted via SLIC clustering. The system extracts superpixels, calculates HOG descriptors for each, and classifies the descriptors using an AdaBoost classifier. Evaluation on two datasets shows the superpixel-based approach achieves 3% higher accuracy than the original HOG method while requiring more computational time for training.
Вычислительный эксперимент в молекулярной биофизике белков и биомембранIlya Klabukov
Вычислительный эксперимент в молекулярной биофизике белков и биомембран
Ефремов Роман Гербертович, доктор физико-математических наук, профессор, заместитель директора по науке Института биоорганической химии имени академиков М.М.Шемякина и Ю.А.Овчинникова РАН, руководитель Лаборатории моделирования биомолекулярных систем
synbio2012.ru
1) The document proposes a cardinality-constrained k-means clustering approach to address practical challenges with standard k-means, such as skewed clustering and sensitivity to outliers.
2) It formulates the problem as a mixed integer nonlinear program (MINLP) and provides a convex relaxation to the problem using semidefinite programming (SDP).
3) The approach provides optimality guarantees and a rounding algorithm to recover an integer feasible solution. Numerical experiments demonstrate competitive performance versus heuristics.
This document summarizes research on using algebraic multigrid (AMG) methods to solve equations modeling porous media flow. The key points are:
1) A spectral element-based AMG method is used to build a coarse level that represents important components in the problem's near-nullspace added by high contrast in properties.
2) Directly applying standard AMG to the resulting coarse problem is ineffective since it has a different structure than assumed.
3) A "three-level" approach is taken where the coarse problem is transformed to match assumptions of a standard AMG, which enables accurate and scalable solution of problems with millions of unknowns.
Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...Xavier Llorà
This document summarizes research using a Pittsburgh Learning Classifier System (LCS) called GAssist to predict protein structure by determining coordination numbers (CN). The researchers tested GAssist on a dataset of over 250,000 protein residues, comparing it to support vector machines, Naive Bayes, and C4.5 decision trees. While support vector machines achieved the best accuracy, GAssist produced more interpretable and compact rule sets at the cost of lower performance. The researchers analyzed the interpretability and scalability of GAssist for this challenging bioinformatics problem, identifying avenues for improving its accuracy while maintaining explanatory power.
Gradient boosting is an ensemble technique that combines weak learners, such as decision trees, into a single strong learner. It works by iteratively fitting a prediction model to the negative gradient of the loss function, which represents the residuals or errors from the previous iteration. This process of residual fitting helps reduce bias and variance in the model. Gradient boosting has advantages over other techniques like bagging in that it can reduce both bias and variance through its iterative process, while avoiding overfitting by using weak learners and a regularization parameter.
This document provides a final report on a project to classify particle collision data from the Large Hadron Collider using machine learning models. In the first part, the author conducts exploratory data analysis on the training data, which has 250,000 examples and 33 features. Notable findings include some complete and relevant features, correlations between features, and imbalanced target classes. In the second part, the author experiments with various machine learning models, including KNN, logistic regression, decision trees, and gradient boosting models. LightGBM performs best, achieving 84% accuracy on a validation set. Hyperparameter tuning is then used to further improve LightGBM performance.
This document discusses sampling algorithms for quickly sampling in high-dimensional configuration spaces. It introduces configuration space and roadmaps, describes two main ideas for improving sampling, and presents a utility-guided multi-query sampling algorithm evaluated in experiments.
Towards Minimal Test Collections for Evaluation of Audio Music Similarity and...Julián Urbano
Reliable evaluation of Information Retrieval systems requires large amounts of relevance judgments. Making these annotations is quite complex and tedious for many Music Information Retrieval tasks, so performing such evaluations requires too much effort. A low-cost alternative is the application of Minimal Test Collection algorithms, which offer quite reliable results while significantly reducing the annotation effort. The idea is to incrementally select what documents to judge so that we can compute estimates of the effectiveness differences between systems with a certain degree of confidence. In this paper we show a first approach towards its application to the evaluation of the Audio Music Similarity and Retrieval task, run by the annual MIREX evaluation campaign. An analysis with the MIREX 2011 data shows that the judging effort can be reduced to about 35% to obtain results with 95% confidence.
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
A tale of scale & speed: How the US Navy is enabling software delivery from l...sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
Building RAG with self-deployed Milvus vector database and Snowpark Container...Zilliz
This talk will give hands-on advice on building RAG applications with an open-source Milvus database deployed as a docker container. We will also introduce the integration of Milvus with Snowpark Container Services.
20 Comprehensive Checklist of Designing and Developing a WebsitePixlogix Infotech
Dive into the world of Website Designing and Developing with Pixlogix! Looking to create a stunning online presence? Look no further! Our comprehensive checklist covers everything you need to know to craft a website that stands out. From user-friendly design to seamless functionality, we've got you covered. Don't miss out on this invaluable resource! Check out our checklist now at Pixlogix and start your journey towards a captivating online presence today.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer’s life and facilitate a rapid transition from concept to production-ready applications.He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
2. Inversion of the RTE
Once solution of RTE is known:
comparison between Stokes spectra of synthetic and observed
spectrum
trial-and-error changes of the initial parameters of the atmosphere
(„human inversions“)
until observed and synthetic (fitted) profile matches
Inversions:
Nothing else but an optimization of the trial-and-error part
Problem:
Inversions always find a solution within the given model
atmosphere. Solution is seldomly unique (might even be
completely wrong).
Goal of this lecture:
Principles of genetic algorithms
Learn the usage of the HeLIx+ inversion code, develop a feeling
on the reliability of inversion results.
A. Lagg - Abisko Winter School 2
3. The merit function
The quality of the model atmosphere must be evaluated
Stokes profiles represent discrete sampled functions
widely used: chisqr definition
weight
sum over (also WL-dep)
sum over
WL-pixels
number of free Stokes
parameters
RTE gives the Stokes spectrum Issyn
The unknowns of the system are the (height dependent) model
parameters:
A. Lagg - Abisko Winter School 3
4. HeLIx+ overview of features
includes Zeeman, Paschen-Back, Hanle effect (He 10830)
atomic polarization for He 10830 (He D3)
magneto-optical effects
fitting / removing telluric lines
fitting unknown parameters of spectral lines
various methods for continuum correction / fitting
convolution with instrument filter profiles
user-defined weighting scheme
direct read access to SOT/SP, VTT-TIP2, SST-CRISP, ...
flexible atomic data configuration
extensive IDL based display routines
MPI support (to invert maps)
Download from http://www.mps.mpg.de/homes/lagg
GBSO download-section helix
use invert and IR$soft
A. Lagg - Abisko Winter School 4
5. The inversion technique: reliability
Two minimizations
steepest gradient
Pikaia
implemented:
Levenberg-Marquardt:
requires good initial
guess
PIKAIA (genetic
algorithm, Charbonneau
1995):
no initial guess needed
planned: DIRECT
algorithm (good
compromise between
global min and speed)
A. Lagg - Abisko Winter School 5
6. Initial guess problem
Having a good initial guess for the iteration process improves both
the speed and the convergence of the inversion.
A. Lagg - Abisko Winter School 6
7. Initial guess optimizations
Weak field initialization Auer77 initialization
Other methods:
Artificial Neural Networks (ANN)
MDI / magnetograph formulae
use a minimization technique which does not rely on initial
guess values
A. Lagg - Abisko Winter School 7
8. Genetic algorithms P. Spijker, TU Eindhoven
Genetic algorithms (GA‟s) are a
technique to solve problems which need
optimization
GA‟s are a subclass of Evolutionary
Computing
GA‟s are based on Darwin‟s theory of
evolution
History of GA‟s:
Evolutionary computing evolved in the
1960‟s.
GA‟s were created by John Holland in
the mid-70‟s.
A. Lagg - Abisko Winter School 8
9. Advantages / drawbacks
No derivatives of the goodness of fit function with
respect to model parameters need be computed; it
matters little whether the relationship between the
model and its parameters is linear or nonlinear.
Nothing in the procedure outlined above depends
critically on using a least-squares statistical estimator;
any other robust estimator can be substituted, with little
or no changes to the overall procedure.
In most real applications, the model will need to be
evaluated (i.e., given a parameter set, compute a
synthetic dataset and its associated goodness of fit) a
great many times; if this evaluation is computationally
expensive, the forward modeling approach can become
impractical.
A. Lagg - Abisko Winter School 9
10. Evolution in biology
Each cell of a living thing contains chromosomes - strings of
DNA
Each chromosome contains a set of genes - blocks of DNA
Each gene determines some aspect of the organism (like eye
colour)
A collection of genes is sometimes called a genotype
A collection of aspects (like eye colour) is sometimes called a phenotype
Reproduction involves recombination of genes from parents
and then small amounts of mutation (errors) in copying
The fitness of an organism is how much it can reproduce
before it dies
Evolution based on “survival of the fittest”
A. Lagg - Abisko Winter School 10
11. Biological reproducion
During reproduction “errors” occur
Due to these “errors” genetic variation exists
Most important “errors” are:
Recombination (cross-over)
Mutation
A. Lagg - Abisko Winter School 11
12. Natural selection
The origin of species: “Preservation of favourable
variations and rejection of unfavourable variations.”
There are more individuals born than can survive, so there
is a continuous struggle for life.
Individuals with an advantage have a greater chance for
survive: survival of the fittest.
Important aspects in natural selection are:
adaptation to the environment
isolation of populations in different groups which cannot
mutually mate
If small changes in the genotypes of individuals are
expressed easily, especially in small populations, we speak
of genetic drift
“success in life”: mathematically expressed as fitness
A. Lagg - Abisko Winter School 12
13. How to apply to RTE? David Hales (www.davidhales.com)
GA‟s often encode solutions as fixed length “bitstrings”
(e.g. 101110, 111111, 000101)
Each bit represents some aspect of the proposed
solution to the problem
For GA‟s to work, we need to be able to “test” any
string and get a “score” indicating how “good” that
solution is
definition of “fitness function” required: convenient to
use chisqr merit function
GA‟s improve the fitness – maximization technique
A. Lagg - Abisko Winter School 13
14. Example – Drilling for oil David Hales (www.davidhales.com)
Imagine you had to drill for oil somewhere along a
single 1km desert road
Problem: choose the best place on the road that
produces the most oil per day
We could represent each solution as a position on the
road
Say, a whole number between [0..1000]
Solution1 = 300 Solution2 = 900
Road
0 500 1000
A. Lagg - Abisko Winter School 14
15. Encoding problem
The set of all possible solutions [0..1000] is called the
search space or state space
In this case it‟s just one number but it could be many
numbers or symbols
Often GA‟s code numbers in binary producing a
bitstring representing a solution
In our example we choose 10 bits which is enough to
represent 0..1000
512 256 128 64 32 16 8 4 2 1
900 1 1 1 0 0 0 0 1 0 0
300 0 1 0 0 1 0 1 1 0 0
1023 1 1 1 1 1 1 1 1 1 1
In GA‟s these encoded strings are sometimes called “genotypes” or
“chromosomes” and the individual bits are sometimes called “genes”
A. Lagg - Abisko Winter School 15
16. Fitness of oil function
Solution1 = 300 Solution2 = 900
(0100101100) (1110000100)
Road
0 1000
OIL
30
5
Location
A. Lagg - Abisko Winter School 16
17. Search space
Oil example: search space is one dimensional
(and stupid: how to define a fitness function?).
RTE: encoding several values into the
chromosome many dimensions can be
searched
Search space an be visualised as a surface or
fitness landscape in which fitness dictates
height (fitness / chisqr hypersurface)
Each possible genotype is a point in the space
A GA tries to move the points to better places
(higher fitness) in the space
A. Lagg - Abisko Winter School 17
19. Search space
Obviously, the nature of the search space
dictates how a GA will perform
A completely random space would be bad for
a GA
Also GA‟s can, in practice, get stuck in local
maxima if search spaces contain lots of
these
Generally, spaces in which small
improvements get closer to the global
optimum are good
A. Lagg - Abisko Winter School 19
20. The algorithm
Generate a set of random solutions
Repeat
Test each solution in the set (rank them)
Remove some bad solutions from set
Duplicate some good solutions
make small changes to some of them
Until best solution is good enough
How to duplicate good solutions?
A. Lagg - Abisko Winter School 20
21. Adding Sex
Two high scoring “parent” bit strings
sex
(chromosomes) are selected and with some
probability (crossover rate) combined
result of sex
Producing two new offsprings (bit strings)
parents are seldom
Each offspring may then be changed randomly
happy with the
(mutation) result
Selecting parents: many schemes possible,
example:
Roulette Wheel
Add up the fitness's of all chromosomes
Generate a random number R in that range
Select the first chromosome in the population
that - when all previous fitness‟s are added -
gives you at least the value R
A. Lagg - Abisko Winter School 21
22. Example population
No. Chromosome Fitness
1 1010011010 1
2 1111100001 2
3 1011001100 3
4 1010000000 1
5 0000010000 3
6 1001011111 5
7 0101010101 1
8 1011100111 2
sum: 18
A. Lagg - Abisko Winter School 22
23. Roulette Wheel Selection
1 2 3 4 5 6 7 8
1 2 3 1 3 5 1 2
0 Rnd[0..18] = 7 Rnd[0..18] = 12 18
Chromosome4 Chromosome6
Parent1 Parent2
Higher chance of picking a fit chromosome!
A. Lagg - Abisko Winter School 23
24. Crossover - Recombination
1011011111
1010000000 Parent1 Offspring1
1010000000
1001011111 Offspring2
Parent2
Crossover
single point - With some high probability (crossover
random rate) apply crossover to the parents.
(typical values are 0.8 to 0.95)
A. Lagg - Abisko Winter School 24
25. Mutation
mutate
1011001111
1011011111 Offspring1
Offspring1
1000000000
1010000000 Offspring2
Offspring2
Original offspring Mutated offspring
With some small probability (the mutation rate)
flip each bit in the offspring (typical values
between 0.1 and 0.001)
A. Lagg - Abisko Winter School 25
26. Improved algorithm
Generate a population of random chromosomes
Repeat (each generation)
Calculate fitness of each chromosome
Repeat
Use roulette selection to select pairs of
parents
Generate offspring with crossover and
mutation
Until a new population has been produced
Until best solution is good enough
A. Lagg - Abisko Winter School 26
27. Many Variants of GA
Different kinds of selection (not roulette):
Tournament, Elitism, etc.
Different recombination:
one-point crossover, multi-point crossover, 3 way crossover etc.
Different kinds of encoding other than bitstring
Integer values, Ordered set of symbols
Different kinds of mutation
variable mutation rate
Different reduction plans
controls how newly bred offsprings are inserted into the
population
PIKAIA (Charbonneau, 1995)
A. Lagg - Abisko Winter School 27
29. List of ME Codes (incomplete)
HeLIx+
A. Lagg, most flexible code (multi-comp, multi line), He 10830
Hanle slab model implemented. Genetic algorithm Pikaia. Fully
parallel.
VFISV
J.M.Borrero, for SDO HMI. Fastest ME code available. F90, fully
parallel. Levenberg-Marquardt with some optimizations.
MERLIN
Written by Jose Garcia at HAO in C, C++ and some other routines
in Fortran. (Lites et al. 2007 in Il Nouvo Cimento)
MELANIE
Hector Socas at HAO. In F90, not parallel. Numerical derivatives.
HAZEL
Artoro Lopez Ariste et al. (2008). Optimized for He 10830, He
D3, Hanle-slab model.
MILOS
Orozco Suarez et al. (2007), IDL, some papers published with it
A. Lagg - Abisko Winter School 29
30. Installation & Usage of HeLIX+
Follow instructions on user„s manual:
Basic usage:
1-component model, create & invert synthetic spectrum
discuss problems:
parameter crosstalk
uniqueness of solution
stability & reliability
influence of noise
Download from http://www.mps.mpg.de/homes/lagg
GBSO download-section helix
use invert and IR$soft
A. Lagg - Abisko Winter School 30
31. Exercise II:
HeLIx+ installation and basic usage
install and run IDL interface of Synthesis
HeLIx+ add complexity to atmospheric
the first input file: synthesis of model (stray-light, multi-
Fe I 6302.5 component)
add 2nd spectral line (Fe
change atmospheric
parameters (B, INC, …) 6301.5)
change line parameters
(quantum numbers, geff) blind tests:
display Zeeman pattern take synthetic profile from
add noise someone else and invert it
1st inversion
play with noise level / initial Which parameters are robust?
values / parameter range How can robustness be
weighting scheme improved?
Download first input file: abisko_1c.ipt
http://www.mps.mpg.de/homes/lagg/
A. Lagg - Abisko Winter School 31