The VoiceMOS Challenge 2022 aimed to encourage research in automatic prediction of mean opinion scores (MOS) for speech quality. It featured two tracks evaluating systems' ability to predict MOS ratings from a large existing dataset or a separate listening test. 21 teams participated in the main track and 15 in the out-of-domain track. Several teams outperformed the best baseline, which fine-tuned a self-supervised model, though the top-performing approaches generally involved ensembling or multi-task learning. While unseen systems were predictable, unseen listeners and speakers remained a difficulty, especially for generalizing to a new test. The challenge highlighted progress in MOS prediction but also the need for metrics reflecting both ranking and absolute accuracy
Recent progress on voice conversion: What is next?NU_I_TODALAB
Invited Talk at IEEE SLT 2021
Title: "Recent progress on voice conversion: What is next?"
Speaker: Tomoki Toda
Toda Laboratory, Department of Intelligent Systems, Graduate School of Informatics, Nagoya University
Statistical voice conversion with direct waveform modelingNU_I_TODALAB
Lecture slides by Tomoki Toda
Tutorial [T2] at INTERSPEECH 2019
Title: "Statistical voice conversion with direct waveform modeling"
Lecturers: Tomoki Toda, Kazuhiro Kobayashi, Tomoki Hayashi
Toda Laboratory, Department of Intelligent Systems, Graduate School of Informatics, Nagoya University
Recent progress on voice conversion: What is next?NU_I_TODALAB
Invited Talk at IEEE SLT 2021
Title: "Recent progress on voice conversion: What is next?"
Speaker: Tomoki Toda
Toda Laboratory, Department of Intelligent Systems, Graduate School of Informatics, Nagoya University
Statistical voice conversion with direct waveform modelingNU_I_TODALAB
Lecture slides by Tomoki Toda
Tutorial [T2] at INTERSPEECH 2019
Title: "Statistical voice conversion with direct waveform modeling"
Lecturers: Tomoki Toda, Kazuhiro Kobayashi, Tomoki Hayashi
Toda Laboratory, Department of Intelligent Systems, Graduate School of Informatics, Nagoya University
Interactive voice conversion for augmented speech productionNU_I_TODALAB
Invited Talk at SNL 2021
Title: "Interactive voice conversion for augmented speech production"
Speaker: Tomoki Toda
Toda Laboratory, Department of Intelligent Systems, Graduate School of Informatics, Nagoya University
GAN-based statistical speech synthesis (in Japanese)Yuki Saito
Guest presentation at "Applied Gaussian Process and Machine Learning," Graduate School of Information Science and Technology, The University of Tokyo, Japan, 2021.
Investigation of Text-to-Speech based Synthetic Parallel Data for Sequence-to...NU_I_TODALAB
APSIPA ASC 2021
Ding Ma, Wen-Chin Huang, Tomoki Toda: Investigation of text-to-speech-based synthetic parallel data for sequence-to-sequence non-parallel voice conversion, Dec. 2021
Toda Laboratory, Department of Intelligent Systems, Graduate School of Informatics, Nagoya University
Presentation for Interspeech 2022: "The VoiceMOS Challenge 2022"
Presenter: Dr. Erica Cooper, National Institute of Informatics
Preprint: https://arxiv.org/abs/2203.11389
Video: https://youtu.be/99ZQ-SLUvKE
Challenge website: https://voicemos-challenge-2022.github.io
Thu-SS-OS-9-5
We present the first edition of the VoiceMOS Challenge, a scientific event that aims to promote the study of automatic prediction of the mean opinion score (MOS) of synthetic speech. This challenge drew 22 participating teams from academia and industry who tried a variety of approaches to tackle the problem of predicting human ratings of synthesized speech. The listening test data for the main track of the challenge consisted of samples from 187 different text-to-speech and voice conversion systems spanning over a decade of research, and the out-of-domain track consisted of data from more recent systems rated in a separate listening test. Results of the challenge show the effectiveness of fine-tuning self-supervised speech models for the MOS prediction task, as well as the difficulty of predicting MOS ratings for unseen speakers and listeners, and for unseen systems in the out-of-domain setting.
Interactive voice conversion for augmented speech productionNU_I_TODALAB
Invited Talk at SNL 2021
Title: "Interactive voice conversion for augmented speech production"
Speaker: Tomoki Toda
Toda Laboratory, Department of Intelligent Systems, Graduate School of Informatics, Nagoya University
GAN-based statistical speech synthesis (in Japanese)Yuki Saito
Guest presentation at "Applied Gaussian Process and Machine Learning," Graduate School of Information Science and Technology, The University of Tokyo, Japan, 2021.
Investigation of Text-to-Speech based Synthetic Parallel Data for Sequence-to...NU_I_TODALAB
APSIPA ASC 2021
Ding Ma, Wen-Chin Huang, Tomoki Toda: Investigation of text-to-speech-based synthetic parallel data for sequence-to-sequence non-parallel voice conversion, Dec. 2021
Toda Laboratory, Department of Intelligent Systems, Graduate School of Informatics, Nagoya University
Presentation for Interspeech 2022: "The VoiceMOS Challenge 2022"
Presenter: Dr. Erica Cooper, National Institute of Informatics
Preprint: https://arxiv.org/abs/2203.11389
Video: https://youtu.be/99ZQ-SLUvKE
Challenge website: https://voicemos-challenge-2022.github.io
Thu-SS-OS-9-5
We present the first edition of the VoiceMOS Challenge, a scientific event that aims to promote the study of automatic prediction of the mean opinion score (MOS) of synthetic speech. This challenge drew 22 participating teams from academia and industry who tried a variety of approaches to tackle the problem of predicting human ratings of synthesized speech. The listening test data for the main track of the challenge consisted of samples from 187 different text-to-speech and voice conversion systems spanning over a decade of research, and the out-of-domain track consisted of data from more recent systems rated in a separate listening test. Results of the challenge show the effectiveness of fine-tuning self-supervised speech models for the MOS prediction task, as well as the difficulty of predicting MOS ratings for unseen speakers and listeners, and for unseen systems in the out-of-domain setting.
Chapter 9: Evaluation techniques
from
Dix, Finlay, Abowd and Beale (2004).
Human-Computer Interaction, third edition.
Prentice Hall. ISBN 0-13-239864-8.
http://www.hcibook.com/e3/
MediaEval 2016 - UNIFESP Predicting Media Interestingness Taskmultimediaeval
Presenter: Samuel G. Fadel
UNIFESP at MediaEval 2016: Predicting Media Interestingness Task In Working Notes Proceedings of the MediaEval 2016 Workshop, Hilversum, Netherlands, October 20-21, CEUR-WS.org (2016) by Jurandy Almeida
Paper: http://ceur-ws.org/Vol-1739/MediaEval_2016_paper_28.pdf
Video: https://youtu.be/YLthKNczlcA
Abstract: This paper describes the approach proposed by UNIFESP for the MediaEval 2016 Predicting Media Interestingness Task and for its video subtask only. The proposed approach is based on combining learning-to-rank algorithms for predicting the interestingness of videos by their visual content.
Best Practices in Recommender System ChallengesAlan Said
Recommender System Challenges such as the Netflix Prize, KDD Cup, etc. have contributed vastly to the development and adoptability of recommender systems. Each year a number of challenges or contests are organized covering different aspects of recommendation. In this tutorial and panel, we present some of the factors involved in successfully organizing a challenge, whether for reasons purely related to research, industrial challenges, or to widen the scope of recommender systems applications.
Weakly-Supervised Sound Event Detection with Self-AttentionNU_I_TODALAB
IEEE ICASSP 2020
Koichi Miyazaki, Tatsuya Komatsu, Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Weakly-supervised sound event detection with self-attention, May 2020
Toda Laboratory, Department of Intelligent Systems, Graduate School of Informatics, Nagoya University
2018 Speech Processing Courses in Crete (SPCC2018)
"Toawrds flexible and intelligible end-to-end speech synthesis systems"
Hands-on slides
Tomoki Toda: Hands on Voice Conversion, July 26, 2018
Toda Laboratory, Department of Intelligent Systems, Graduate School of Informatics, Nagoya University
2018 Speech Processing Courses in Crete (SPCC2018)
"Toawrds flexible and intelligible end-to-end speech synthesis systems"
Lecture slides
Tomoki Toda: Advanced Voice Conversion, July 26, 2018
Toda Laboratory, Department of Intelligent Systems, Graduate School of Informatics, Nagoya University
Missing Component Restoration for Masked Speech Signals based on Time-Domain ...NU_I_TODALAB
IEEE International Workshop on Machine Learning for Signal Processing (MLSP2017)
Nominated For Best Student Paper Award (student: Shogo Seki)
Shogo Seki, Hirokazu Kameoka, Tomoki Toda, Kazuya Takeda: Missing Component Restoration for Masked Speech Signals based on Time-Domain Spectrogram Factorization,Sep. 2017
Toda Laboratory, Department of Intelligent Systems, Graduate School of Informatics, Nagoya University
Welcome to WIPAC Monthly the magazine brought to you by the LinkedIn Group Water Industry Process Automation & Control.
In this month's edition, along with this month's industry news to celebrate the 13 years since the group was created we have articles including
A case study of the used of Advanced Process Control at the Wastewater Treatment works at Lleida in Spain
A look back on an article on smart wastewater networks in order to see how the industry has measured up in the interim around the adoption of Digital Transformation in the Water Industry.
Vaccine management system project report documentation..pdfKamal Acharya
The Division of Vaccine and Immunization is facing increasing difficulty monitoring vaccines and other commodities distribution once they have been distributed from the national stores. With the introduction of new vaccines, more challenges have been anticipated with this additions posing serious threat to the already over strained vaccine supply chain system in Kenya.
Explore the innovative world of trenchless pipe repair with our comprehensive guide, "The Benefits and Techniques of Trenchless Pipe Repair." This document delves into the modern methods of repairing underground pipes without the need for extensive excavation, highlighting the numerous advantages and the latest techniques used in the industry.
Learn about the cost savings, reduced environmental impact, and minimal disruption associated with trenchless technology. Discover detailed explanations of popular techniques such as pipe bursting, cured-in-place pipe (CIPP) lining, and directional drilling. Understand how these methods can be applied to various types of infrastructure, from residential plumbing to large-scale municipal systems.
Ideal for homeowners, contractors, engineers, and anyone interested in modern plumbing solutions, this guide provides valuable insights into why trenchless pipe repair is becoming the preferred choice for pipe rehabilitation. Stay informed about the latest advancements and best practices in the field.
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdffxintegritypublishin
Advancements in technology unveil a myriad of electrical and electronic breakthroughs geared towards efficiently harnessing limited resources to meet human energy demands. The optimization of hybrid solar PV panels and pumped hydro energy supply systems plays a pivotal role in utilizing natural resources effectively. This initiative not only benefits humanity but also fosters environmental sustainability. The study investigated the design optimization of these hybrid systems, focusing on understanding solar radiation patterns, identifying geographical influences on solar radiation, formulating a mathematical model for system optimization, and determining the optimal configuration of PV panels and pumped hydro storage. Through a comparative analysis approach and eight weeks of data collection, the study addressed key research questions related to solar radiation patterns and optimal system design. The findings highlighted regions with heightened solar radiation levels, showcasing substantial potential for power generation and emphasizing the system's efficiency. Optimizing system design significantly boosted power generation, promoted renewable energy utilization, and enhanced energy storage capacity. The study underscored the benefits of optimizing hybrid solar PV panels and pumped hydro energy supply systems for sustainable energy usage. Optimizing the design of solar PV panels and pumped hydro energy supply systems as examined across diverse climatic conditions in a developing country, not only enhances power generation but also improves the integration of renewable energy sources and boosts energy storage capacities, particularly beneficial for less economically prosperous regions. Additionally, the study provides valuable insights for advancing energy research in economically viable areas. Recommendations included conducting site-specific assessments, utilizing advanced modeling tools, implementing regular maintenance protocols, and enhancing communication among system components.
Automobile Management System Project Report.pdfKamal Acharya
The proposed project is developed to manage the automobile in the automobile dealer company. The main module in this project is login, automobile management, customer management, sales, complaints and reports. The first module is the login. The automobile showroom owner should login to the project for usage. The username and password are verified and if it is correct, next form opens. If the username and password are not correct, it shows the error message.
When a customer search for a automobile, if the automobile is available, they will be taken to a page that shows the details of the automobile including automobile name, automobile ID, quantity, price etc. “Automobile Management System” is useful for maintaining automobiles, customers effectively and hence helps for establishing good relation between customer and automobile organization. It contains various customized modules for effectively maintaining automobiles and stock information accurately and safely.
When the automobile is sold to the customer, stock will be reduced automatically. When a new purchase is made, stock will be increased automatically. While selecting automobiles for sale, the proposed software will automatically check for total number of available stock of that particular item, if the total stock of that particular item is less than 5, software will notify the user to purchase the particular item.
Also when the user tries to sale items which are not in stock, the system will prompt the user that the stock is not enough. Customers of this system can search for a automobile; can purchase a automobile easily by selecting fast. On the other hand the stock of automobiles can be maintained perfectly by the automobile shop manager overcoming the drawbacks of existing system.
Courier management system project report.pdfKamal Acharya
It is now-a-days very important for the people to send or receive articles like imported furniture, electronic items, gifts, business goods and the like. People depend vastly on different transport systems which mostly use the manual way of receiving and delivering the articles. There is no way to track the articles till they are received and there is no way to let the customer know what happened in transit, once he booked some articles. In such a situation, we need a system which completely computerizes the cargo activities including time to time tracking of the articles sent. This need is fulfilled by Courier Management System software which is online software for the cargo management people that enables them to receive the goods from a source and send them to a required destination and track their status from time to time.
About
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Technical Specifications
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
Key Features
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface
• Compatible with MAFI CCR system
• Copatiable with IDM8000 CCR
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
Application
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Saudi Arabia stands as a titan in the global energy landscape, renowned for its abundant oil and gas resources. It's the largest exporter of petroleum and holds some of the world's most significant reserves. Let's delve into the top 10 oil and gas projects shaping Saudi Arabia's energy future in 2024.
Democratizing Fuzzing at Scale by Abhishek Aryaabh.arya
Presented at NUS: Fuzzing and Software Security Summer School 2024
This keynote talks about the democratization of fuzzing at scale, highlighting the collaboration between open source communities, academia, and industry to advance the field of fuzzing. It delves into the history of fuzzing, the development of scalable fuzzing platforms, and the empowerment of community-driven research. The talk will further discuss recent advancements leveraging AI/ML and offer insights into the future evolution of the fuzzing landscape.
Student information management system project report ii.pdfKamal Acharya
Our project explains about the student management. This project mainly explains the various actions related to student details. This project shows some ease in adding, editing and deleting the student details. It also provides a less time consuming process for viewing, adding, editing and deleting the marks of the students.
1. The VoiceMOS Challenge 2022
Wen-Chin Huang¹, Erica Cooper², Yu Tsao³, Hsin-Min Wang³, Tomoki Toda¹, Junichi Yamagishi²
¹Nagoya University, Japan
²National Institute of Informatics, Japan
³Academia Sinica, Taiwan
第141回音声言語情報処理
研究発表会 招待講演
2. Outline
I. Introduction
II. Challenge description
A. Tracks and datasets
B. Rules and timeline
C. Evaluation metrics
D. Baseline systems
III. Challenge results
A. Participants demographics
B. Results, analysis and discussion
1. Comparison of baseline systems
2. Analysis of top systems
3. Sources of difficulty
4. Analysis of metrics
IV. Conclusions
4. Important to evaluate speech synthesis systems, ex. text-to-speech (TTS), voice conversion (VC).
Mean opinion score (MOS) test:
Rate quality of individual samples.
Speech quality assessment
Subjective
test
Drawbacks:
1. Expensive: Costs too much time and money.
2. Context-dependent: numbers cannot be meaningfully
compared across different listening tests.
5. Important to evaluate speech synthesis systems, ex. text-to-speech (TTS), voice conversion (VC).
Mean opinion score (MOS) test:
Rate quality of individual samples.
Speech quality assessment
Subjective
test
Objective
assessment
Data-driven MOS prediction
(mostly based on deep learning)
Drawback: bad
generalization ability
due to limited training
data.
6. Goals of the VoiceMOS challenge
*Accepted as an Interspeech 2022 special session!
Encourage research in
automatic data-driven
MOS prediction
Compare different
approaches using
shared datasets and
evaluation
Focus on the
challenging case of
generalizing to a
separate listening test
Promote discussion
about the future of
this research field
9. Tracks and dataset: Main track
The BVCC Dataset
● Samples from 187 different systems all rated together in one listening test
○ Past Blizzard Challenges (text-to-speech synthesis) since 2008
○ Past Voice Conversion Challenges (voice conversion) since 2016
○ ESPnet-TTS (implementations of modern TTS systems), 2020
● Test set contains some unseen systems, unseen listeners, and unseen speakers and is
balanced to match the distribution of scores in the training set
10. Listening test data from the Blizzard Challenge 2019
● “Out-of-domain” (OOD): Data from a completely separate listening test
● Chinese-language synthesis from systems submitted to the 2019 Blizzard Challenge
● Test set has some unseen systems and unseen listeners
● Designed to reflect a real-world setting where a small amount of labeled data is available
● Study generalization ability to a different listening test context
● Encourage unsupervised and semi-supervised approaches using unlabeled data
Tracks and dataset: OOD track
10% 40% 10% 40%
Labeled train set Unlabeled train set Dev set Test set
12. Rules and timeline
Training phase
✔Main track train/dev audio+rating
✔OOD track train/dev audio+rating
Leaderboard available
Need to submit at least once
2021/12/5 2022/2/21 2022/2/28 2022/3/7
Break phase
No submission available.
Conduct result analysis.
Evaluation phase
✔Main track test
audio
✔OOD track test audio
Leaderboard available
Max 3 submissions
Post-evaluation phase
✔Main track test rating
✔OOD track test rating
✔Evaluation results
Submission & leaderboard reopen
General rules
External data is permitted.
All participants need to submit system description.
13. Evaluation metrics
System-level and Utterance-level
● Mean Squared Error (MSE): difference between predicted and actual MOS
● Linear Correlation Coefficient (LCC): a basic correlation measure
● Spearman Rank Correlation Coefficient (SRCC): non-parametric; measures ranking order
● Kendall Tau Rank Correlation (KTAU): more robust to errors
import numpy as np
import scipy.stats
# `true_mean_scores` and `predict_mean_scores` are both 1-d numpy arrays.
MSE = np.mean((true_mean_scores - predict_mean_scores)**2)
LCC = np.corrcoef(true_mean_scores, predict_mean_scores)[0][1]
SRCC = scipy.stats.spearmanr(true_mean_scores, predict_mean_scores)[0]
KTAU = scipy.stats.kendalltau(true_mean_scores, predict_mean_scores)[0]
Following prior work, we
picked system-level SRCC as
the main evaluation metric.
14. Baseline system: SSL-MOS
Fine-tune a self-supervised learning based (SSL) speech model for the MOS prediction task
● Pretrained wav2vec2
● Simple mean pooling and a linear fine-tuning layer
● Wav2vec2 model parameters are updated during fine-tuning
E. Cooper, W.-C. Huang, T. Toda, and J. Yamagishi, “Generalization ability of MOS prediction networks,” in Proc. ICASSP, 2022
15. Baseline system: MOSANet
● Originally developed for noisy speech
assessment
● Cross-domain input features:
○ Spectral information
○ Complex features
○ Raw waveform
○ Features extracted from SSL models
R. E. Zezario, S.-W. Fu, F. Chen, C.-S. Fuh, H.-M. Wang, and Y. Tsao, “Deep Learning-based Non-Intrusive
Multi-Objective Speech Assessment Model with Cross-Domain Features,” arXiv preprint
arXiv:2111.02363, 2021.
16. Baseline system: LDNet
Listener-dependent modeling
● Specialized model structure and inference
method allows making use of multiple ratings
per audio sample.
● No external data is used!
W.-C. Huang, E. Cooper, J. Yamagishi, and T. Toda, “LDNet: Unified Listener Dependent Modeling in MOS Prediction for Synthetic Speech,” in Proc. ICASSP, 2022
17. Challenge results
I. Participants demographics
II. Results, analysis and discussion
A. Comparison of baseline systems
B. Analysis of top systems
C. Sources of difficulty
D. Analysis of metrics
18. Participants demographics
Number of teams: 22 teams + 3 baselines
14 teams are from academia, 5 teams are
from industry, 3 teams are personal
Main track: 21 teams + 3 baselines
OOD track: 15 teams + 3 baselines
Baseline systems:
● B01: SSL-MOS
● B02: MOSANet
● B03: LDNet
20. Comparison of baseline systems: main track
In terms of system-level SRCC,
11 teams outperformed the best
baseline, B01!
However, the gap between the
best baseline and the top
system is not large…
21. Comparison of baseline systems: OOD track
In terms of system-level SRCC,
only 2 teams outperformed or
on par with B01.
The gap is even smaller…
Participant feedback:
“The baseline was too strong!
Hard to get improvement!”
22. Analysis of approaches used
● Main track: Finetuning SSL > using SSL features > not using SSL
● OOD track: finetuned SSL models were both the best and worst systems
● Popular approaches:
○ Ensembling (top team in main track; top 2 teams in OOD track)
○ Multi-task learning
○ Use of speech recognizers (top team in OOD track)
Finetuned SSL
SSL
features
No
SSL
23. Analysis of approaches used
● 7 teams used per-listener ratings
● No teams used listener demographics
○ One team used “listener group”
● OOD track: only 3 teams used the unlabeled data:
Conducted their
own listening test
(top team)
Task-adaptive
pretraining
“Pseudo-label” the
unlabeled data using
trained model
24. Sources of difficulty
Are unseen categories more difficult?
Unseen systems
Unseen speakers
Unseen listeners
Main track OOD track
no
yes (7 teams)
Category
yes (17 teams)
yes (6 teams)
N/A
no
25. Sources of difficulty
Systems with large differences between their training and test set distributions are harder to predict.
27. Analysis of metrics
Why did you use system-level
SRCC as main metric?
Why are there 4 metrics?
There are two main
usage of MOS
prediction models:
Compare a lot of systems
Ex. evaluation in scientific challenges.
→ Ranking-related metrics are preferred
(LCC, SRCC, KTAU)
Evaluate absolute goodness of a system
Ex. use as objective function in training.
→ Numerical metrics are preferred
(MSE)
28. Analysis of metrics
We calculated the linear correlation coefficients between all metrics using main track results.
Correlation coefficients between
ranking based metrics are close to 1.
MSE is different from the other
three metrics.
⇒ Future researchers can consider just reporting 2 metrics: MSE + {LCC, SRCC, KTAU}.
⇒ It is still of significant importance to develop a general metric that reflects both aspects.
30. Conclusions
⇒ Attracted more than
20 participant teams.
⇒ SSL is very
powerful in this task.
⇒ Generalizing to a
different listening test
is still very hard.
⇒ There will be a 2nd,
3rd, 4th,... version!!
The goals of the VoiceMOS challenge:
31. Useful materials
The CodaLab challenge page is still open! Datasets are free to download. VoiceMOS Challenge
The baseline systems are open-sourced!
● https://github.com/nii-yamagishilab/mos-finetune-ssl
● https://github.com/dhimasryan/MOSA-Net-Cross-Domain
● https://github.com/unilight/LDNet
The arXiv paper is available!
● https://arxiv.org/abs/2203.11389