This document discusses training domain scientists in computational and data skills. It notes the growing volume of data in fields like astronomy and the challenges this poses for traditional approaches, and advocates teaching skills such as statistics, machine learning, and programming. Examples are given of bootcamps, seminars, and degree programs in these areas at UC Berkeley taught by CS and statistics faculty. Challenges discussed include fitting such training into formal curricula and ensuring participation from underrepresented groups. The creation of collaborative spaces is proposed to better connect domain scientists with methodological experts as data plays a growing role in their fields.
Here we are again, faced with yet another misinterpretation of a scientific paper, one that turns a discovery made by serious scientists into an endless stream of posts, articles, and everything else about an alien structure built around the star KIC 8462852, which makes no sense at all. The purpose of this post is, once again, to clarify every point of this discovery, accompanied by the papers and by a video on my channel where I explain all the details about exoplanets, exocomets, Kepler, and the serious research carried out by the volunteers of the citizen-science project Planet Hunters. Enjoy the read.
"Bizarre." "Interesting." "Giant Transit." Those were the reactions of Planet Hunters volunteers when they first looked at the light curve of the once-normal, Sun-like star KIC 8462852.
Out of the more than 150,000 stars under constant observation during the four years of NASA's Kepler primary mission, between 2009 and 2013, this star stood out because of the inexplicable dips in its brightness. While almost everyone bets on natural causes for this strange dimming, some have suggested other possibilities.
You will recall that the orbiting Kepler observatory continuously monitored stars in a fixed field of view centered on the constellations Lyra and Cygnus, hoping to record periodic dips in starlight caused by transiting exoplanets. When a dip in brightness was observed, further transits were sought to confirm the detection of a new exoplanet.
The search for extraterrestrial civilizations with large energy supplies – Sérgio Sacani
Solar System Processing with LSST: A Status Update – Mario Juric
An update for the LSST Solar System Science Collaboration on the work in progress on data products and software needed to support the Solar System science. Delivered at DPS 2017 meeting.
A seven-planet resonant chain in TRAPPIST-1 – Sérgio Sacani
The TRAPPIST-1 system is the first transiting planet system found orbiting an ultracool dwarf star [1]. At least seven planets similar in radius to Earth were previously found to transit this host star [2]. Subsequently, TRAPPIST-1 was observed as part of the K2 mission and, with these new data, we report the measurement of an 18.77 day orbital period for the outermost transiting planet, TRAPPIST-1 h, which was previously unconstrained. This value matches our theoretical expectations based on Laplace relations [3] and places TRAPPIST-1 h as the seventh member of a complex chain, with three-body resonances linking every member. We find that TRAPPIST-1 h has a radius of 0.752 R⊕ and an equilibrium temperature of 173 K. We have also measured the rotational period of the star to be 3.3 days and detected a number of flares consistent with a low-activity, middle-aged, late M dwarf.
Astronomical Objects Detection in Celestial Bodies Using Computer Vision Algorithms – csandit
Computer vision, astronomy, and astrophysics work together quite productively, to the point where they are natural complements. Without computer vision algorithms, the progress of astronomy and astrophysics would have slowed nearly to a deadlock. New research and calculations can yield more information as well as higher-quality data; consequently, an organized view of planetary surfaces can change everything in the long run. A new discovery may present puzzling complexity or a possible branching of paths, yet the quest to learn more about celestial bodies by means of computer vision algorithms will continue. The detection of astronomical objects in celestial bodies is a challenging task. This paper presents an implementation of detecting astronomical objects in celestial bodies using computer vision algorithms, with satisfactory performance. It also puts forward some observations linking computer vision, astronomy, and astrophysics.
The advancement of technology over the last decade or so has allowed astronomy to see exponential growth in data volumes. ESA's space telescope Euclid will gather high-resolution images of a third of the sky (~850 GB of data downloaded daily for 6 years); by 2032 the ground-based telescope LSST will have generated 500 PB of data; and the radio telescope SKA will produce more data per second than the entire worldwide internet. This talk addresses what techniques currently exist for handling big data volumes, how the astronomical community can prepare for this big data wave, and what other challenges lie ahead.
Describes data science efforts at Berkeley, with a particular focus on teaching and the new Berkeley Institute for Data Science (BIDS), funded by the Moore and Sloan Foundations. BIDS will be a space for the open and interdisciplinary work that is typical of the SciPy community. In the creation of BIDS, open source scientific tools for data science, and specifically the SciPy ecosystem, played an important role.
Building Data Literacy Among Middle School Administrators and Teachers
Data literacy is an essential trait for middle school administrators and teachers to possess. In this session, the Research and Accountability Team from Durham Public Schools will discuss how it has expanded its focus on Data-to-Action to building data literacy amongst its middle school administrators and teachers during 2013-14.
J. Brent Cooper, Terri Mozingo & Karin Beckett Durham Public Schools - Durham, NC
Ethics of Big Data is about finding alignment between an organization's core values and their day-to-day actions in a way that balances risk and innovation. As Big Data brings business operations and practices deeper and more fully into individual lives, it is creating a forcing function that raises ethical questions about our values around concepts like identity, privacy, ownership, and reputation. How we understand those values and align them with our actions when innovating products and services using Big Data technologies benefits from a framework that provides a common vocabulary and encourages explicit discussion.
The material will address the intersection of ethics and Big Data; what it is and what it isn't. Specifically, how to approach and generate dialog about an abstract subject with direct, real-world implications. A general framework for talking about ethics in the context of Big Data will be introduced.
Aspects include:
1. Direct relevance to your data handling practices
2. How Big Data is influencing important concepts including identity, privacy, ownership, and reputation
3. Ethical Decision Points
4. Value Personas as a tool for encouraging discussion and generating agreement and alignment between values and actions
5. Balancing the benefits of Big Data innovation and the risks of harm
The webcast will present key concepts from the forthcoming book Ethics of Big Data.
Promoting Data Literacy at the Grassroots (ACRL 2015, Portland, OR) – Adam Beauchamp
Presentation given at ACRL 2015, with Christine Murray, on teaching undergraduate students to discover and evaluate datasets for secondary data analysis.
My presentation in Week of Robotics, Helsinki, Finland on November 28th, 2014. My purpose was to initiate discussion about the possibilities and risks of using Big Data in combination with robotics, especially from ethical perspective. My main reference was Davis & Patterson (2012): Ethics of Big Data which I recommend as further reading.
Computational Training for Domain Scientists & Data Literacy – Joshua Bloom
Data literacy connects knowledge of deep concepts in statistics, computer science, visualization, and domain science with a practical understanding of the data, and the stories it tells, that we encounter in our lives. As data becomes more pervasive, we see teaching data literacy to students as part of a broad education as an imperative for the 21st century. Learning how to arm the next generation with the tools to make the most of data, and to avoid its common pitfalls, will be a major thrust of our efforts with the Moore/Sloan initiative at Berkeley, through the Berkeley Institute for Data Science.
Astronomical Data Processing on the LSST Scale with Apache Spark – Databricks
The next decade promises to be exciting for both astronomy and computer science, with a number of large-scale astronomical surveys in preparation. One of the most important is the Large Synoptic Survey Telescope, or LSST. LSST will produce the first 'video' of the deep sky in history by continually scanning the visible sky, taking one 3.2-gigapixel image every 20 seconds. In this talk we will describe LSST's unique design and how its image processing pipeline produces catalogs of astronomical objects. To process and quickly cross-match catalog data we built AXS (Astronomy Extensions for Spark), a system based on Apache Spark. We will explain its design and what is behind its great cross-matching performance.
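AXS's internals are not spelled out in the abstract above, but the standard trick behind fast catalog cross-matching, bucketing sources into narrow declination zones so each source is compared only against a few zones rather than the whole second catalog, can be sketched in plain NumPy. Everything below (the function name, toy coordinates) is illustrative, not AXS code:

```python
import numpy as np
from collections import defaultdict

def crossmatch_zones(ra1, dec1, ra2, dec2, radius_deg=1.0 / 3600):
    """Match catalog 1 against catalog 2 within radius_deg, bucketing
    catalog 2 into declination zones so each source is compared only
    against its own zone and the two neighbouring zones."""
    ra1, dec1 = np.asarray(ra1), np.asarray(dec1)
    ra2, dec2 = np.asarray(ra2), np.asarray(dec2)
    zone_h = radius_deg                        # zone height >= match radius
    by_zone = defaultdict(list)                # zone id -> catalog-2 indices
    for j, d in enumerate(dec2):
        by_zone[int(np.floor(d / zone_h))].append(j)
    matches = []
    for i, (r, d) in enumerate(zip(ra1, dec1)):
        z = int(np.floor(d / zone_h))
        for zz in (z - 1, z, z + 1):           # only 3 zones can hold a match
            for j in by_zone.get(zz, []):
                # small-angle (flat-sky) approximation to angular separation
                dra = (r - ra2[j]) * np.cos(np.radians(d))
                ddec = d - dec2[j]
                if dra * dra + ddec * ddec <= radius_deg ** 2:
                    matches.append((i, j))
    return matches

# Toy catalogs: one pair ~0.5 arcsec apart, plus one unrelated source.
pairs = crossmatch_zones([10.0], [20.0], [10.0001, 50.0], [20.0001, 20.0])
print(pairs)  # [(0, 0)]
```

The zone bucketing is what turns an all-pairs O(N·M) scan into something close to a sort-merge join; production systems like AXS apply the same idea inside a distributed engine.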
Toward a Global Interactive Earth Observing CyberinfrastructureLarry Smarr
05.01.12
Invited Talk to the 21st International Conference on Interactive Information Processing Systems (IIPS) for Meteorology, Oceanography, and Hydrology Held at the 85th AMS Annual Meeting
Title: Toward a Global Interactive Earth Observing Cyberinfrastructure
San Diego, CA
Identifying Exoplanets with Machine Learning Methods: A Preliminary Study – IJCI Journal
The discovery of habitable exoplanets has long been a heated topic in astronomy. Traditional methods for exoplanet identification include the wobble method, direct imaging, gravitational microlensing, etc., which not only require a considerable investment of manpower, time, and money, but are also limited by the performance of astronomical telescopes. In this study, we proposed the idea of using machine learning methods to identify exoplanets. We used the Kepler dataset collected by NASA from the Kepler Space Observatory to conduct supervised learning, which predicts the existence of exoplanet candidates as a three-class classification task, using decision tree, random forest, naïve Bayes, and neural network models; we used another NASA dataset, consisting of confirmed exoplanet data, to conduct unsupervised learning, which divides the confirmed exoplanets into different clusters using k-means clustering. As a result, our models achieved accuracies of 99.06%, 92.11%, 88.50%, and 99.79%, respectively, in the supervised learning task and successfully obtained reasonable clusters in the unsupervised learning task.
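A miniature version of the supervised task described above can be put together with scikit-learn. The features, labels, and thresholds below are synthetic stand-ins for illustration, not the actual Kepler columns or dispositions used in the study:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 600
# Hypothetical per-object features standing in for Kepler columns
# (e.g., transit depth, duration, signal-to-noise), scaled to [0, 1).
X = rng.uniform(0, 1, size=(n, 3))
# Three-class label mimicking Kepler dispositions:
# 0 = false positive, 1 = candidate, 2 = confirmed.
y = np.where(X[:, 2] < 0.3, 0, np.where(X[:, 0] < 0.5, 1, 2))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
acc = accuracy_score(y_te, clf.predict(X_te))
print(f"held-out accuracy: {acc:.2f}")  # near 1.0 on this easy toy rule
```

On real Kepler tables the same fit/predict pattern applies, but accuracy depends heavily on feature engineering and class balance rather than on the classifier call itself.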
A Search for Technosignatures Around 11,680 Stars with the Green Bank Telescope – Sérgio Sacani
We conducted a search for narrowband radio signals over four observing sessions in 2020–2023 with the L-band receiver (1.15–1.73 GHz) of the 100 m diameter Green Bank Telescope. We pointed the telescope in the directions of 62 TESS Objects of Interest, capturing radio emissions from a total of ∼11,860 stars and planetary systems in the ∼9 arcminute beam of the telescope. All detections were either automatically rejected or visually inspected and confirmed to be of anthropogenic nature. In this work, we also quantified the end-to-end efficiency of radio SETI pipelines with a signal injection and recovery analysis. The UCLA SETI pipeline recovers 94.0% of the injected signals over the usable frequency range of the receiver and 98.7% of the injections when regions of dense RFI are excluded. In another pipeline that uses incoherent sums of 51 consecutive spectra, the recovery rate is ∼15 times smaller, at ∼6%. The pipeline efficiency affects SETI search volume calculations as well as calculations of upper bounds on the number of transmitting civilizations. We developed an improved Drake Figure of Merit for SETI search volume calculations that includes the pipeline efficiency and frequency drift rate coverage. Based on our observations, we found that there is a high probability (94.0–98.7%) that fewer than ∼0.014% of stars earlier than M8 within 100 pc host a transmitter that is detectable in our search (EIRP > 10^12 W). Finally, we showed that the UCLA SETI pipeline natively detects the signals detected with AI techniques by Ma et al. (2023).
Invited talk for the Square Kilometer Array meeting in Wellington, New Zealand, in September 2011, on semantic eScience and semantically enabled virtual observatories, along with future directions.
The Possible Tidal Demise of Kepler's First Planetary System – Sérgio Sacani
We present evidence of tidally driven inspiral in the Kepler-1658 (KOI-4) system, which consists of a giant planet (1.1 R_J, 5.9 M_J) orbiting an evolved host star (2.9 R⊙, 1.5 M⊙). Using transit timing measurements from Kepler, Palomar/WIRC, and TESS, we show that the orbital period of Kepler-1658b appears to be decreasing at a rate Ṗ = −131 (+20/−22) ms yr⁻¹, corresponding to an infall timescale P/Ṗ ≈ 2.5 Myr. We consider other explanations for the data, including line-of-sight acceleration and orbital precession, but find them to be implausible. The observed period derivative implies a tidal quality factor Q′ = 2.50 (+0.85/−0.62) × 10⁴, in good agreement with theoretical predictions for inertial wave dissipation in subgiant stars. Additionally, while it probably cannot explain the entire inspiral rate, a small amount of planetary dissipation could naturally explain the deep optical eclipse observed for the planet via enhanced thermal emission. As the first evolved system with detected inspiral, Kepler-1658 is a new benchmark for understanding tidal physics at the end of the planetary life cycle.
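The quoted ≈2.5 Myr infall timescale is just the orbital period divided by the decay rate. A quick sanity check, assuming Kepler-1658b's roughly 3.85-day orbital period (taken from the discovery literature, not stated in the abstract above):

```python
# Sanity check of the P / |Pdot| ~ 2.5 Myr inspiral timescale.
P_days = 3.85                      # orbital period of Kepler-1658b (assumed)
P_ms = P_days * 86400 * 1000       # period converted to milliseconds
Pdot = 131.0                       # measured decay rate, ms per year
timescale_Myr = P_ms / Pdot / 1e6  # years of decay at this rate, in Myr
print(f"P/|Pdot| ~ {timescale_Myr:.1f} Myr")  # ~2.5 Myr, matching the abstract
```

The agreement confirms the abstract's numbers are internally consistent: a ~330-million-millisecond period shrinking by 131 ms per year vanishes in a few million years.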
Cyberinfrastructure to Support Ocean ObservatoriesLarry Smarr
05.03.18
Invited Talk to the Ocean Studies Board
National Research Council
Title: Cyberinfrastructure to Support Ocean Observatories
University of California San Diego
Research Data Infrastructure for Geochemistry (DFG Roundtable) – Kerstin Lehnert
This presentation provides an overview of different aspects of data management for geochemistry and resources available at the EarthChem@IEDA data facility.
An article describing the discovery of the exoplanet Kepler-432b, an exoplanet more massive than Jupiter that orbits a red giant star at close range, in an extremely elongated orbit.
Autoencoding RNN for inference on unevenly sampled time-series data – Joshua Bloom
For the past decade, feature-engineering-based approaches applied to the discovery of transients and the characterization of tens of thousands of variable stars led the way to novel astronomical inference. Here I will show that new auto-encoder recurrent neural network architectures, without hand-crafted features, rival those traditional methods. Autonomous discovery and inference are part of a larger worldwide onus to federate precious (and heterogeneous) follow-up resources to maximize our collective scientific returns.
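A core trick in such architectures is handling uneven sampling by feeding the network the time gap Δt since the previous observation alongside each measurement, rather than assuming a fixed cadence. A minimal forward-pass sketch, with untrained weights and a toy light curve invented for illustration (plain NumPy; not the paper's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(1)

def rnn_encode(times, mags, hidden=8):
    """Encode an unevenly sampled light curve into a fixed-length vector
    with a vanilla RNN whose per-step input is (magnitude, delta-t)."""
    dts = np.diff(times, prepend=times[0])     # delta-t feature; 0 for 1st point
    X = np.column_stack([mags, dts])           # shape (T, 2)
    W_in = rng.normal(size=(2, hidden)) * 0.1  # untrained demo weights
    W_h = rng.normal(size=(hidden, hidden)) * 0.1
    h = np.zeros(hidden)
    for x in X:
        h = np.tanh(x @ W_in + h @ W_h)        # standard RNN state update
    return h                                   # fixed-size embedding

# Irregularly sampled toy light curve: 20 points at random epochs.
t = np.sort(rng.uniform(0, 100, 20))
m = 15.0 + 0.3 * np.sin(2 * np.pi * t / 17.0)
z = rnn_encode(t, m)
print(z.shape)  # (8,)
```

In the autoencoder setting, a matching decoder would be trained to reconstruct the light curve from the embedding z, so that z itself becomes a learned feature vector replacing hand-crafted statistics.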
The ongoing digitization of the industrial-scale machines that power and enable human activity is itself a major global transformation. But the real revolution, in efficiencies and in improved and saved lives, will happen as machine learning automation and insights are properly coupled to the complex systems of industrial data. Leveraging a systems view of real-world use cases from aviation to transportation, I contrast the needs and approaches of consumer versus industrial machine learning, focusing on three key areas: coupling physics-based models with data-driven models, differential privacy and secure ML (including edge-to-cloud strategies), and interpretability of model predictions.
PyData 2015 Keynote: "A Systems View of Machine Learning" – Joshua Bloom
Despite the growing abundance of powerful tools, building and deploying machine-learning frameworks into production continues to be a major challenge, in both science and industry. I'll present some particular pain points and cautions for practitioners, as well as recent work addressing some of the nagging issues. I advocate for a systems view which, when expanded beyond the algorithms and codes to the organizational ecosystem, places some interesting constraints on the teams tasked with development and stewardship of ML products.
About: Dr. Joshua Bloom is an astronomy professor at the University of California, Berkeley where he teaches high-energy astrophysics and Python for data scientists. He has published over 250 refereed articles largely on time-domain transients events and telescope/insight automation. His book on gamma-ray bursts, a technical introduction for physical scientists, was published recently by Princeton University Press. He is also co-founder and CTO of wise.io, a startup based in Berkeley. Josh has been awarded the Pierce Prize from the American Astronomical Society; he is also a former Sloan Fellow, Junior Fellow at the Harvard Society, and Hertz Foundation Fellow. He holds a PhD from Caltech and degrees from Harvard and Cambridge University.
Large-Scale Inference in Time Domain Astrophysics – Joshua Bloom
Presented at the 2014 Workshop on Algorithms for Modern Massive Data Sets (MMDS 2014), June 19, 2014 (Berkeley, CA):
The scientific promise of modern astrophysical surveys, from exoplanets to gravitational waves, is palpable. Yet extracting insight from the data deluge is neither guaranteed nor trivial: existing paradigms for analysis are already beginning to break down under the data velocity. I will describe our efforts to apply statistical machine learning to large-scale astronomy datasets, both in batch and streaming mode. From the discovery of supernovae to the characterization of tens of thousands of variable stars, such approaches are leading the way to novel inference. Specific discoveries concerning precision distance measurements and using LSST as a pseudo-spectrograph will be discussed.
Joshua Bloom: Machine Learning and Classification in the Synoptic Survey Era – Joshua Bloom
This is a talk given at the "From Data to Knowledge" Workshop on streaming data in Berkeley, California.
YouTube: http://www.youtube.com/watch?v=aEoj7eHh6Gg&feature=plcp
Twitter: @profjsb
http://lyra.berkeley.edu/CDIConf/program.html
Computational Training and Data Literacy for Domain Scientists
1. Computational Training & Data Literacy for Domain Scientists
Joshua Bloom
UC Berkeley, Astronomy
@profjsb
"Training Students to Extract Value from Big Data", National Academies of Science, DC, 11 April 2014
2. What is the toolbox of the modern (data-driven) scientist?
Word cloud of answers: domain training, statistics, advanced computing, database, GUI, parallel, visualization, Bayesian, machine learning, physics, laboratory techniques, MCMC, MapReduce.
3. What is the toolbox of the modern (data-driven) scientist? And... how do we teach this with what little time the students have?
7. Our ML framework found the Nearest Supernova in 3 Decades
‣ Built & deployed a real-time ML framework, discovering >10,000 events in >10 TB of imaging → 50+ journal articles
‣ Built probabilistic event classification catalogs with innovative active learning
http://timedomain.org https://www.nsf.gov/news/news_summ.jsp?cntn_id=122537
11. a modern superglue computing language for science
‣ high-level scripting language
‣ open source, huge & growing community in academia & industry
‣ just-in-time compilation but also fast numerical computation
‣ extensive interfaces to 3rd-party frameworks
A reasonable lingua franca for scientists...
13. ‣ 3 days of live/archive streamed lectures
‣ all open material in GitHub
‣ widely disseminated (e.g., @ NASA)
‣ funded (~$18k) by the Vice Chancellor for Research & NSF (BIGDATA)
http://pythonbootcamp.info
14. Part of the Designated Emphasis in Computational Science & Engineering at Berkeley:
visualization · machine learning · database interaction · user interfaces & web frameworks · time series & numerical computing · interfacing to other languages · Bayesian inference & MCMC · hardware control · parallelism
16. TERRA – optimized for small planets
Time-domain preprocessing:
- start with raw photometry
- Gaussian process detrending
- calibration (Petigura & Marcy 2012)
Transit search:
- matched filter, similar to the BLS algorithm (Kovács+ 2002)
- leverages the Fast-Folding Algorithm, O(N²) → O(N log N) (Staelin+ 1968)
Data validation:
- flag significant peaks in the periodogram that are inconsistent with an exoplanet transit
[Plot: raw flux (ppt) vs. detrended/calibrated photometry from TERRA]
Erik Petigura, Berkeley Astro grad student
Petigura, Howard, & Marcy (2013)
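The matched-filter transit search sketched on this slide can be illustrated with a brute-force box search over phase-folded photometry (a minimal sketch in the spirit of BLS, not the actual TERRA code; the function names are illustrative, and the real pipeline uses fast folding rather than this naive grid):

```python
import numpy as np

def box_search_snr(time, flux, period, t0, duration):
    """SNR of a box-shaped dimming at a trial (period, t0, duration).

    Phase-folds the light curve and compares the mean in-transit flux
    to the out-of-transit mean, scaled by the photometric scatter.
    """
    phase = np.mod(time - t0, period)
    in_transit = phase < duration
    if in_transit.sum() < 3 or (~in_transit).sum() < 3:
        return 0.0
    depth = flux[~in_transit].mean() - flux[in_transit].mean()
    noise = flux[~in_transit].std() / np.sqrt(in_transit.sum())
    return depth / noise

def grid_search(time, flux, periods, durations, n_t0=50):
    """Naive search over a grid of (P, t0, duration), as on the slide."""
    best = (0.0, None)
    for P in periods:
        for dur in durations:
            for t0 in np.linspace(0.0, P, n_t0, endpoint=False):
                snr = box_search_snr(time, flux, P, t0, dur)
                if snr > best[0]:
                    best = (snr, (P, t0, dur))
    return best
```

Injecting a synthetic 1% transit at a 3-day period into white-noise photometry and searching a period grid around 3 days recovers the injected period at high SNR.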
Prevalence of Earth-size planets orbiting Sun-like stars
Erik A. Petigura, Andrew W. Howard, and Geoffrey W. Marcy
Astronomy Department, University of California, Berkeley, CA 94720; Institute for Astronomy, University of Hawaii at Manoa, Honolulu, HI 96822
Contributed by Geoffrey W. Marcy, October 22, 2013 (sent for review October 18, 2013)
Determining whether Earth-like planets are common or rare looms as a touchstone in the question of life in the universe. We searched for Earth-size planets that cross in front of their host stars by examining the brightness measurements of 42,000 stars from National Aeronautics and Space Administration’s Kepler mission. We found 603 planets, including 10 that are Earth size (1–2 R⊕) and receive comparable levels of stellar energy to that of Earth (0.25–4 F⊕). We account for Kepler’s imperfect detectability of such planets by injecting synthetic planet-caused dimmings into the Kepler brightness measurements and recording the fraction detected. We find that 11 ± 4% of Sun-like stars harbor an Earth-size planet receiving between one and four times the stellar intensity as Earth. We also find that the occurrence of Earth-size planets is constant with increasing orbital period (P), within equal intervals of log P up to ∼200 d. Extrapolating, one finds 5.7 (+1.7/−2.2)% of Sun-like stars harbor an Earth-size planet with orbital periods of 200–400 d.
extrasolar planets | astrobiology
The National Aeronautics and Space Administration’s (NASA’s) Kepler mission was launched in 2009 to search for planets that transit (cross in front of) their host stars (1–4). The resulting dimming of the host stars is detectable by measuring their brightness, and Kepler monitored the brightness of 150,000 stars every 30 min for 4 y. To date, this exoplanet survey has detected more than 3,000 planet candidates (4).
The most easily detectable planets in the Kepler survey are those that are relatively large and orbit close to their host stars, especially those stars having lower intrinsic brightness fluctuations (noise). These large, close-in worlds dominate the list of known exoplanets. However, the Kepler brightness measurements can be analyzed and debiased to reveal the diversity of planets,
We searched for transiting planets in Kepler brightness measurements using our custom-built TERRA software package described in previous works (6, 9) and in SI Appendix. In brief, TERRA conditions Kepler photometry in the time domain, removing outliers, long-timescale variability (>10 d), and systematic errors common to a large number of stars. TERRA then searches for transit signals by evaluating the signal-to-noise ratio (SNR) of prospective transits over a finely spaced 3D grid of orbital period, P, time of transit, t0, and transit duration, ΔT. This grid-based search extends over the orbital period range of 0.5–400 d.
TERRA produced a list of “threshold crossing events” (TCEs) that meet the key criterion of a photometric dimming signal-to-noise ratio SNR > 12. Unfortunately, an unwieldy 16,227 TCEs met this criterion, many of which are inconsistent with the periodic dimming profile from a true transiting planet. Further vetting was performed by automatically assessing which light curves were consistent with theoretical models of transiting planets (10). We also visually inspected each TCE light curve, retaining only those exhibiting a consistent, periodic, box-shaped dimming, and rejecting those caused by single-epoch outliers, correlated noise, and other data anomalies. The vetting process was applied homogeneously to all TCEs and is described in further detail in SI Appendix.
To assess our vetting accuracy, we evaluated the 235 Kepler objects of interest (KOIs) among Best42k stars having P > 50 d, which had been found by the Kepler Project and identified as planet candidates in the official Exoplanet Archive (exoplanetarchive.ipac.caltech.edu; accessed 19 September 2013). Among them, we found four whose light curves are not consistent with being planets. These four KOIs (364.01, 2224.02, 2311.01, and 2474.01) have long periods and small radii (SI Appendix). This exercise suggests that our vetting process is robust and that careful scrutiny of the light curves of small planets in long-period orbits is useful to identify false positives.
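The paper's completeness correction — injecting synthetic planet-caused dimmings and recording the fraction detected — can be sketched in miniature. This toy version evaluates a box SNR at the injected parameters instead of re-running a blind search, and all function names are illustrative, not from TERRA:

```python
import numpy as np

def inject_transit(time, flux, period, t0, duration, depth):
    """Return a copy of flux with a box-shaped transit dimming injected."""
    out = flux.copy()
    out[np.mod(time - t0, period) < duration] -= depth
    return out

def completeness(time, flux, period, duration, depth,
                 n_trials=100, snr_threshold=12.0, seed=0):
    """Fraction of injected transits recovered above the SNR threshold
    (the paper's criterion is SNR > 12)."""
    rng = np.random.default_rng(seed)
    n_recovered = 0
    for _ in range(n_trials):
        t0 = rng.uniform(0.0, period)           # random transit epoch
        injected = inject_transit(time, flux, period, t0, duration, depth)
        in_t = np.mod(time - t0, period) < duration
        depth_est = injected[~in_t].mean() - injected[in_t].mean()
        noise = injected[~in_t].std() / np.sqrt(in_t.sum())
        n_recovered += (depth_est / noise) > snr_threshold
    return n_recovered / n_trials
```

On white-noise photometry, deep injected transits are recovered in nearly every trial while very shallow ones almost never clear the SNR > 12 cut, which is exactly the detectability fraction the occurrence-rate calculation divides by.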
Bootcamp/Seminar alum · Python · DOE/NERSC computation · PNAS [2014]
17. “Are we alone in the universe? What makes up the missing mass of the universe? ... And maybe the biggest question of all: how in the wide world can you add $3 billion in market capitalization simply by adding .com to the end of a name?”
— President William Jefferson Clinton, Science and Technology Policy Address, 21 January 2000
“Add Data Science or Big Data to your course name to increase enrollment by tenfold.”
— Joshua Bloom, just now
19. ‣ Where do bootcamps & seminars fit into traditional domain-science curricula?
- formal coursework competes with research obligations for graduate students
‣ Are they too vocational/practical for higher ed?
‣ Who should teach them, and how do we credit them?
20. first this... ...then this.
Undergraduate & Graduate Training Mission: Thinking Data Literacy before Thinking Big Data Proficiency
21. Undergraduate & Graduate Training Mission: Thinking Data Literacy before Thinking Big Data Proficiency
Data analysis recipes: Fitting a model to data
David W. Hogg (Center for Cosmology and Particle Physics, Department of Physics, New York University; Max-Planck-Institut für Astronomie, Heidelberg)
Jo Bovy (Center for Cosmology and Particle Physics, Department of Physics, New York University)
Dustin Lang (Department of Computer Science, University of Toronto; Princeton University Observatory)
Abstract: We go through the many considerations involved in fitting a model to data, using as an example the fit of a straight line to a set of points in a two-dimensional plane. Standard weighted least-squares fitting is only appropriate when there is a dimension along which the data points have negligible uncertainties, and another along which all the uncertainties can be described by Gaussians of known variance; these ...
[Figures from the paper (“Fitting a straight line to data”): a straight-line fit to example data in the x–y plane, with best-fit line y = 1.33 x + 164]
arXiv:1008.4686v1 [astro-ph.IM] 27 Aug 2010
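The “standard weighted least-squares fitting” named in the abstract — negligible x uncertainties, known Gaussian y variances — is a few lines of linear algebra (a minimal sketch, not code from the paper; the function name is illustrative):

```python
import numpy as np

def fit_line_wls(x, y, sigma_y):
    """Weighted least-squares fit of y = m*x + b with known Gaussian
    uncertainties sigma_y on y (and negligible uncertainties on x).

    Solves the normal equations A^T C^-1 A p = A^T C^-1 y,
    where A has columns [1, x] and C = diag(sigma_y**2).
    """
    x, y, sigma_y = map(np.asarray, (x, y, sigma_y))
    A = np.vander(x, 2, increasing=True)   # design matrix, columns [1, x]
    w = 1.0 / sigma_y**2                   # inverse-variance weights
    ATA = A.T @ (w[:, None] * A)
    ATy = A.T @ (w * y)
    b, m = np.linalg.solve(ATA, ATy)
    cov = np.linalg.inv(ATA)               # covariance of (b, m)
    return m, b, cov
```

With data drawn exactly from the figure's best-fit line (y = 1.33 x + 164), the fit recovers the slope and intercept to floating-point precision.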
Statistical Inference
22. Versioning & Reproducibility
“Recently, the scientific community was shaken by reports that a troubling proportion of peer-reviewed preclinical studies are not reproducible.” — McNutt, 2014
http://www.sciencemag.org/content/343/6168/229.summary
- Git has emerged as the de facto versioning tool
- Berkeley Common Environment (BCE) software stack
- “Reproducible and Collaborative Statistical Data Science” (Statistics 157: P. Stark)
- next up: versioning (big) data?
26. The “novelty²” problem:
Established CS/Stats/Math in service of novelty in domain science
vs.
novelty in domain science driving & informing novelty in CS/Stats/Math
An extra burden for forefront scientists.
https://medium.com/tech-talk/dd88857f662
27. Berkeley Institute for Data Science (BIDS)
‣ a physical space & new entity dedicated to the Moore/Sloan Data Science principles
‣ goal: a rich resource and ecosystem for domain scientists to connect & collaborate with methodologists
http://bitly.com/bundles/fperezorg/1
“Bold new partnership launches to harness potential of data scientists and big data”
30. Towards an Inclusive Ecosystem: Expanding Participation Among Underrepresented Groups
2013 Python bootcamp attendees: 11% female, 56% male, 33% decline to state
- 2013 AMP Camp: < 5% women
- this workshop: 2 women out of 22 speakers
- 2013 Python Seminar: 36% women
31. Summary
‣ Data Literacy before Big Data Proficiency
‣ domain science is increasingly dependent upon methodological competencies
‣ the role of such training in higher ed is still TBD
• formal coursework competes for students’ time
‣ we need to create inclusive, collaborative environments bridging domains & methodologies