A Study on Handwritten Chinese Character Synthesis
using Generative Networks
Graduate School of System and Engineering
Advisor: Hidemoto Nakada
Author: Liangyu Liu
Overview
• Chinese character synthesis is challenging: a huge number of characters
and fonts, and complex character structure.
• Zi2Zi is a powerful model for printed-type Chinese
character synthesis, but underwhelming at
generating handwriting.
Overview
Badly formed
handwritings
Overview
• We aim to improve zi2zi on Chinese handwriting
character synthesis using three new training
methods.
• Results show that our methods yield better image
quality, and that learning from easy to hard tasks
with curriculum learning improves the learning
outcome.
Overview
Badly formed
handwritings
Contents
01 Background
02 Method
03 Experiment and Result
04 Conclusion
1 Background
Background
• Numerous characters.
• Complex structure.
• Only a few handwriting samples are available for a given font.
• A sample shows the difference between Chinese
characters and alphabets.
Challenges of Chinese
character synthesis
Background
• Re-purpose a well-trained
model for another related
task.
• Train faster and more effectively.
• Improve performance on another
related task by fine-tuning.
Transfer Learning
Taken from https://ruder.io/transfer-learning/index.html
Background
• Based on a conditional GAN for image-to-image translation.
• Main loss functions: Adversarial loss and L1 loss.
• Discriminator to distinguish
whether the images are
real or fake.
• Generator to synthesize
more realistic images.
Pix2Pix
Taken from Image-to-Image Translation with Conditional Adversarial Networks
https://arxiv.org/abs/1611.07004
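The two pix2pix losses above can be sketched in a few lines of plain Python. This is only an illustration: the helper names are made up here, and the λ weight follows the pix2pix paper's default of 100 rather than anything stated on the slides.

```python
import math

def l1_loss(generated, target):
    """Mean absolute pixel difference between two flattened images."""
    n = len(generated)
    return sum(abs(g, ) if False else abs(g - t) for g, t in zip(generated, target)) / n

def generator_loss(d_score_on_fake, generated, target, lam=100.0):
    """Pix2pix generator objective: fool the discriminator + stay close in L1.

    d_score_on_fake: discriminator's probability that the generated image is real.
    lam: L1 weight (the pix2pix paper uses lambda = 100).
    """
    adversarial = -math.log(d_score_on_fake + 1e-12)  # non-saturating GAN loss
    return adversarial + lam * l1_loss(generated, target)
```

Both terms pull in the same direction: the adversarial term rewards realism, while the L1 term keeps the output pixel-wise close to the target, which is what sharpens character strokes.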
Background
• An encoder-decoder model
based on a fully convolutional
network.
• Skip connections between the
encoding layers and the
corresponding decoding layers.
U-Net
Taken from U-Net: Convolutional Networks for Biomedical Image Segmentation
https://arxiv.org/pdf/1505.04597
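The skip-connection idea can be illustrated with a toy forward pass. This is a sketch only: a real U-Net concatenates feature maps along the channel axis and learns convolutional encode/decode blocks, whereas here those are stand-in callables.

```python
def unet_forward(x, encode, decode, bottleneck, depth=4):
    """Toy U-Net data flow: save each encoder output and merge it back
    into the decoder stage at the same depth (the skip connection)."""
    skips = []
    for _ in range(depth):
        x = encode(x)
        skips.append(x)           # remember this resolution level
    x = bottleneck(x)
    for skip in reversed(skips):  # deepest skip feeds the first decoder step
        x = decode(x + skip)      # '+' stands in for channel concatenation
    return x
```

Even with identity-like stand-ins, every decoder step receives the matching encoder output; in the real network this is what lets spatial detail lost during down-sampling be recovered.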
Background
• Based on the pix2pix model, targeting Chinese character synthesis.
• Learns multiple fonts at the same time.
• A non-trainable Gaussian noise embedding maps each character to its
corresponding style.
• A multi-class category loss predicts the style of each character, to
keep multiple styles from being confused.
• A constant loss makes the generated character resemble the
source.
• Good results on printed-font synthesis, but underwhelming at
generating handwriting.
Zi2Zi
Badly formed
result
Background
Architecture of Zi2Zi
• Main loss functions:
• G loss, composed of L1 loss,
category loss and constant
loss.
• D loss, composed of real/fake
(adversarial) loss and category loss.
• L1 loss measures the
difference between
generated and real images.
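The loss composition on this slide can be written out directly as a sketch. The individual terms are assumed to be computed elsewhere, and the weights below are illustrative defaults, not the values used in this study.

```python
def g_loss(l1, category, constant, adversarial=0.0,
           l1_w=100.0, const_w=15.0):
    """Generator loss per the slide: adversarial 'cheat' term plus
    weighted L1, category and constant losses. Weights are illustrative."""
    return adversarial + l1_w * l1 + category + const_w * constant

def d_loss(real_fake, category):
    """Discriminator loss per the slide: real/fake loss + category loss."""
    return real_fake + category
```

Keeping the category loss in both objectives is what forces each font's style embedding to stay distinguishable while many fonts are trained together.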
Background
• Training process:
• Prepare many fonts, sample
characters from each font, and render
each sampled character in the source
font to form a pair.
• Train on all the paired images
together at the same time.
• Test process:
• Select the test font by designating
its embedding id.
Process of Zi2Zi
Paired images for
different fonts
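The pairing step above can be sketched as follows. `render` and `sample` are hypothetical helpers (a glyph rasterizer and a character sampler) standing in for the real data pipeline; SIMSUN is the source font used in this study.

```python
def make_pairs(fonts, sample, render):
    """Build (embedding_id, source_image, target_image) training triples:
    each sampled character is drawn in both the source font and the
    target font, tagged with the target font's embedding id.

    render(font, ch) is a hypothetical rasterizer returning an image.
    sample(font) is a hypothetical sampler returning a character list.
    """
    pairs = []
    for emb_id, font in enumerate(fonts):
        for ch in sample(font):
            pairs.append((emb_id, render("SIMSUN", ch), render(font, ch)))
    return pairs
```

At test time, the same embedding id selects which learned style the generator should reproduce.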
Background
• Evaluate the quality and similarity of images.
• SSIM (Structural Similarity Index):
• Perceived similarity between two given images.
• Ranges from 0 to 1; the larger, the more similar.
• PSNR (Peak Signal-to-Noise Ratio):
• Quality of the generated image; the larger, the clearer.
Image quality assessment
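Both metrics can be computed directly. Below is a minimal plain-Python sketch over flattened pixel lists; note that standard SSIM implementations average this same formula over local sliding windows rather than taking one global window.

```python
import math

def psnr(x, y, max_val=255.0):
    """Peak signal-to-noise ratio in dB; larger means less distortion."""
    mse = sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)

def ssim_global(x, y, max_val=255.0):
    """Single-window (global) SSIM over two flattened images."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    c1, c2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2  # stabilizers
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Identical images give SSIM = 1 and an infinite PSNR; heavily blurred generations lower both, which is why the two metrics are reported side by side in the experiments.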
2 Method
Method
• Different stroke weights among printed type,
hard-pen and brush calligraphy.
• The same component (radical) appears many times
within one font.
• Highly personal, stylized fonts.
Hypothesized causes of the badly
formed handwriting from Zi2Zi
Badly formed
samples
Method
1. Using all hard-pen handwritings
for training.
2. Reducing characters that share a
common radical within the same font.
3. Training with less stylized fonts,
then fine-tuning with more stylized
fonts.
Training methods
(Samples: hard-pen, printed type, brush calligraphy)
Method
• Different stroke weights between
fonts have a critical influence on training.
• It is better to concentrate on learning one
similar type of handwriting.
• Hard-pen handwritings tend to have a
clear structure because of their light
stroke weight.
1. Using all hard-pen handwritings
(Samples: hard-pen, printed type, brush calligraphy)
Method
2. Reducing characters that share
a common radical
• The same radical looks similar across
characters within one hard-pen
handwriting font, but differs between
hard-pen handwriting fonts.
• Learning an excessively repeated
radical within the same font is less
effective than learning other, entirely
different characters.
(Samples: the same radical across two fonts, Font I and Font II)
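Method 2 amounts to a simple per-font filter over the character list. In the sketch below, `radical_of` is a hypothetical lookup table from character to radical; in practice it would come from a character-decomposition table, which the slides do not specify.

```python
def cap_radical(chars, radical_of, radical, cap):
    """Keep at most `cap` characters containing `radical` for one font;
    all other characters pass through unchanged.

    radical_of: hypothetical mapping char -> radical.
    """
    kept, seen = [], 0
    for ch in chars:
        if radical_of.get(ch) == radical:
            if seen >= cap:
                continue  # drop the over-represented radical
            seen += 1
        kept.append(ch)
    return kept
```

Applied per font, this trades repeated exposure to one radical for coverage of more distinct characters, which is the effect Experiment II tests.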
Method
• Handwritings differ greatly because of
personal writing habits.
• Curriculum learning: learn from easy
to hard tasks.
• Train on the more printed-like
handwritings first, then fine-tune with the
more personally stylized handwritings.
3. Training from less stylized fonts
to more stylized fonts
(Samples ordered from easy to normal to hard)
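Method 3 is a two-stage training schedule. A minimal sketch, with stage names and epoch counts chosen for illustration only:

```python
def curriculum_schedule(easy_fonts, hard_fonts,
                        pretrain_epochs, finetune_epochs):
    """Yield (stage, epoch, fonts) steps: first train only on the easy
    (printed-like) fonts, then fine-tune on the hard (stylized) ones."""
    for e in range(pretrain_epochs):
        yield ("train", e, easy_fonts)
    for e in range(finetune_epochs):
        yield ("fine-tune", e, hard_fonts)
```

Because fine-tuning starts from weights already fitted to the easy fonts, the model only has to learn the stylistic deviations of the hard fonts, which is the curriculum effect evaluated in Experiment III.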
3 Experiment and Result
Experiment and Result
• Three types of handwriting fonts are prepared: printed,
hard-pen and brush calligraphy.
• Font format: TrueType.
• Source font: SIMSUN.
• A sample of SIMSUN in a TrueType file is shown.
Dataset
Experiment and Result
• Experiment I: design
• Mixed: use a mixed dataset of printed, hard-pen and
brush calligraphy fonts for training.
• Hard-pen: use only hard-pen handwriting fonts for training.
• Test: 5 hard-pen fonts, 3 characters per font.
Experiment and Result
• Experiment II: design
• Original: 30 fonts, 500 characters per font, including
25 characters with the common radical 亻 (475 other characters each).
• Reduced: 50 fonts, 300 characters per font, including
15 characters with the common radical 亻 (285 other characters each).
• Test: 5 hard-pen fonts, 5 characters per font.
Experiment and Result
• Experiment III: design
• Normal: mix 25 printed-like handwriting fonts and 5 more
stylized fonts together for training.
• Easy to hard (e2h): train with the 25 printed-like handwriting
fonts, then fine-tune with the 5 more stylized fonts.
• Test: 5 hard-pen fonts, 5 characters per font.
Experiment and Result
Result of experiment I

Method     SSIM    PSNR
Mixed      0.401   8.791
Hard-pen   0.387   8.861

(Figures: d_loss, g_loss and L1_loss curves; generated samples
compared with ground truth for the mixed and hard-pen settings.)
Experiment and Result
Result of experiment II

Method     SSIM    PSNR
Original   0.434   9.643
Reduced    0.430   9.912

(Figures: d_loss, g_loss and L1_loss curves; generated samples
compared with ground truth for the original and reduced settings.)
Experiment and Result
Result of experiment III

Method     SSIM    PSNR
Normal     0.541   10.489
E2h        0.553   10.638

(Figures: d_loss, g_loss and L1_loss curves; generated samples
compared with ground truth for the normal and e2h settings.)
Experiment and Result
• Our methods achieve higher image quality; the generated samples are less blurred.
• Training from easy to hard tasks with curriculum learning is effective for
Chinese handwriting character synthesis.
• Similarity (SSIM) is not improved by the first two methods; possible reasons:
• Strokes differ more significantly among handwriting fonts than among
mixed-type fonts.
• The selected radical has fewer strokes and a simpler structure than other
components.
Discussion
4 Conclusion
Conclusion
• The initial results show that our training methods produce less blurred
generated images and improve image quality.
• Learning from easy to hard tasks with curriculum learning is effective
in improving zi2zi's learning outcome on Chinese handwriting synthesis
tasks.
• As future work, we want to use the model to create TrueType fonts:
given only a few samples, generate all the remaining characters.
Conclusion and future work
Editor's Notes

  1. Hello everyone. Today I want to introduce our study on Handwritten Chinese Character Synthesis using Generative Networks 
  2. Here is a brief introduction of our work We focus on a Chinese handwriting characters synthesis task. This task is thought to be challenging because of a large amount of characters and fonts in Chinese, and also the complex structure for most Chinese characters. In our previous research, we found a generative model called zi2zi get a good result in generating printed type Chinese characters but underwhelming in handwriting synthesis.
  3. We aim at improving this model on handwriting synthesis tasks, by proposing three new training methods. The initial result shows that we can improve the image quality using our methods. And learning from easy to hard tasks using curriculum learning can also improve learning effect.
  4. Here are the contents of this presentation.
  5. Let’s begin with the first part, the background
  6. We summarize the challenges of Chinese character synthesis as show. Because of the numerous characters and complex structure of Chinese characters, it is impossible for a font designer to design all the characters. Generally, they only design a few characters, for a poster or their artwork. But sometimes, we are interested in what other characters would be like. Dealing with such issue, we assumes that we can use a generative model to infer all the other characters by given a few samples of a certain font. Meanwhile, there are a large amount of fonts for Chinese, so we also expect our model can learn multiple fonts at the same time.
  7. Next, I want to talk about transfer learning. The main method of transfer learning is as the figure shows, we transfer the knowledge accumulated from past training and apply it to a different problem. Basically, we re-purpose a well-trained model onto another related tasks. In this way we can effectively reduce the training time and source cost. Moreover, we can improve the performance of the original model by fine-tuning
  8. Pix2pix is a conditional-GAN-based model: the generation of the output image is conditioned on the source image. There are two main loss functions in pix2pix, an adversarial loss and an L1 loss. The adversarial loss comes from the discriminator's judgment, while the L1 loss measures the pixel-wise difference between the generated image and the expected output image. Together, both losses encourage the generator to produce more realistic images. During training, both generated and real images are shown to the discriminator, which is trained to distinguish whether the images are real or fake, while the generator is trained to synthesize more realistic images.
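The pix2pix generator objective described above can be sketched in a few lines. This is a minimal illustrative NumPy version, not the actual implementation: the function name and the default `l1_weight` of 100 (the lambda used in the pix2pix paper) are assumptions for the sketch.

```python
import numpy as np

def generator_loss(d_fake_logits, generated, target, l1_weight=100.0):
    """Illustrative pix2pix generator objective: adversarial term + weighted L1.

    d_fake_logits     : discriminator logits on the generated images
    generated, target : image arrays with pixel values in [0, 1]
    l1_weight         : lambda weighting the L1 term (assumed, as in pix2pix)
    """
    # Non-saturating adversarial term: softplus(-x) = -log(sigmoid(x)),
    # which pushes the discriminator's output on fakes toward "real".
    adv = np.mean(np.log1p(np.exp(-d_fake_logits)))
    # L1 term: pixel-wise distance to the expected output image.
    l1 = np.mean(np.abs(generated - target))
    return adv + l1_weight * l1
```

When the generated image exactly matches the target, only the adversarial term remains, so the generator is still rewarded for fooling the discriminator.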
  9. The most distinctive part of pix2pix is the generator, an encoder-decoder model with a novel architecture named U-Net. In a U-Net, skip-connections are added between each encoding layer and its corresponding decoding layer. U-Net takes a source image, encodes it by down-sampling to a bottleneck layer, and then decodes the bottleneck representation by up-sampling to generate the target image. The encoder path and decoder path resemble the left and right sides of a capital U.
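The U-shaped data flow with skip-connections can be illustrated with a toy NumPy sketch. This is not the real U-Net (convolutions and learned weights are omitted); it only shows how an encoder feature map is concatenated onto the matching decoder feature map, which is the skip-connection idea.

```python
import numpy as np

def downsample(x):
    """2x average pooling over an (H, W, C) feature map (stand-in for a conv + stride)."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def upsample(x):
    """2x nearest-neighbour upsampling of an (H, W, C) feature map."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def unet_like(x):
    """Toy U-shaped pass: encode to a bottleneck, decode back, and
    concatenate each encoder feature with the matching decoder feature."""
    e1 = downsample(x)                       # encoder level 1
    bottleneck = downsample(e1)              # bottom of the "U"
    d1 = upsample(bottleneck)                # decoder level 1
    d1 = np.concatenate([d1, e1], axis=-1)   # skip connection: channels double
    return upsample(d1)
```

The skip-connection lets fine spatial detail from the encoder bypass the bottleneck, which is why U-Net outputs stay sharp.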
  10. zi2zi is based on the pix2pix model for Chinese character synthesis. Its most important capability is learning multiple fonts at the same time. To achieve this, it applies a non-trainable Gaussian noise vector as a style embedding to map characters to their corresponding styles; the embedding also shrinks the network through dimensionality reduction, since the character bitmaps are sparse and would otherwise require more layers. To keep the model from confusing and mixing styles, a multi-class category loss supervises the discriminator to predict the style of each character and penalize such mixing. Considering that the generated character should resemble the source, a constant loss is also added: the two must appear close to each other in the embedded space, which narrows down the possible search space. zi2zi gets good results on printed-font synthesis; however, many samples are badly formed when it generates handwritings.
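The style-embedding mechanism can be sketched as a lookup table of Gaussian vectors, one row per font, concatenated onto the encoder bottleneck. This is a simplified illustration; the table size, embedding dimension, and function names are assumptions, not zi2zi's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_FONTS, EMB_DIM = 40, 128   # illustrative sizes, not zi2zi's actual values

# Non-trainable Gaussian style embedding: one fixed random vector per font.
style_table = rng.normal(size=(NUM_FONTS, EMB_DIM))

def condition_bottleneck(bottleneck, font_id):
    """Concatenate the font's style vector onto the encoder bottleneck,
    so the decoder knows which font's style to synthesize."""
    return np.concatenate([bottleneck, style_table[font_id]])
```

Because the table is fixed, the rest of the network must learn to interpret each random vector as a font identity, which is what the category loss supervises.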
  11. This is the architecture of zi2zi; we can see the embedding and several loss functions. The losses fall into three kinds: the G loss for the generator, the D loss for the discriminator, and the L1 loss, which measures the difference between generated and real images.
  12. This is the training and test process of zi2zi. We prepare many fonts, designate the number of characters for each font, and render each character paired with its counterpart in the source font. We then train on all these paired images together. For testing, we select the target font by designating its embedding id.
  13. For image quality assessment (IQA), we use SSIM and PSNR to evaluate the similarity and quality of the generated images. The Structural Similarity Index (SSIM) measures the perceived similarity between two given images; it ranges from 0 to 1, where a larger value means the images are more similar. PSNR measures the quality of the generated image: if it contains a lot of noise or blur, the value will be small.
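PSNR follows directly from the mean squared error between the two images. A minimal NumPy sketch of the standard formula (assuming pixel values normalized to [0, 1]):

```python
import numpy as np

def psnr(generated, reference, max_val=1.0):
    """Peak signal-to-noise ratio in dB; higher means less noise/blur.

    max_val is the maximum possible pixel value (1.0 for normalized images).
    """
    mse = np.mean((generated - reference) ** 2)
    if mse == 0:
        return float("inf")   # identical images
    return 20 * np.log10(max_val) - 10 * np.log10(mse)
```

SSIM is more involved (it compares local luminance, contrast, and structure statistics), so in practice a library implementation such as the one in scikit-image is typically used rather than hand-rolling it.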
  14. Next, I want to introduce our methods.
  15. The figure shows badly formed samples produced by zi2zi's original training method. After analyzing the dataset and the generated samples, we hypothesize the following causes. First, we assume stroke weight has a critical effect on training: if stroke weights differ a lot, the generated characters become blurred (like the 2nd character). Second, the same component (radical) may appear too many times within one font, which may reduce the learning effect on that font. Finally, many handwritten characters carry a strong personal style and are hard to recognize.
  16. To test these hypotheses, we propose three new training methods.
  17. The first method is to use only hard-pen handwritings. We found that the same stroke with different weights shows a big difference in structure. Characters written with a brush tend to have heavier strokes than those written with a hard pen; if we train them together, the model may learn wrongly formed structures. So we assume it is better to concentrate on one similar type of handwriting at a time. We also think that lighter strokes lead to a better learning effect: characters written in light strokes tend to have a clear structure, which makes it easier for the model to tell which radical is shown and which part a stroke belongs to, even when a character has many strokes. Conversely, a radical written in heavy strokes may appear to belong to another part, causing the model to learn a wrongly formed structure; this kind of misleading might be the main cause of blur in some generated characters.
  18. The 2nd method is reducing the number of characters that share a common radical. We found that the same radical looks similar across different characters within one hard-pen handwriting font, but differs across fonts. We assume that learning too many characters with the same radical in one font is less effective than learning entirely different characters instead. To make this concrete, consider an extreme case: in case 1, all the chosen characters in font A share the same radical, while in case 2, each font has only one character with that radical. We expect case 2 to have a better learning effect than case 1.
  19. The 3rd method is training from less stylized fonts to more stylized fonts. As the figure shows, handwritings differ greatly because of personal writing habits: the characters at the top have a clearer structure and are easy to identify, more like a printed font, while those at the bottom are more stylized and harder to recognize. Inspired by the common-sense idea that learning from easy to hard material improves the learning effect, we propose to begin training with the more printed-like handwriting and then fine-tune with the more stylized handwriting. This is also known as curriculum learning. We assume this method will also improve the learning effect.
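The easy-to-hard schedule described above amounts to two training phases that reuse the same model weights. A minimal sketch, assuming placeholder names (`train_step`, the epoch counts) that are illustrative rather than the actual experimental settings:

```python
def curriculum_train(model, easy_fonts, hard_fonts, train_step,
                     easy_epochs=30, hard_epochs=10):
    """Easy-to-hard curriculum sketch: train on clearly structured fonts
    first, then fine-tune on the more stylized handwriting with the same
    weights. Epoch counts here are illustrative, not the paper's settings."""
    for _ in range(easy_epochs):          # phase 1: easier, printed-like fonts
        for font in easy_fonts:
            train_step(model, font)
    for _ in range(hard_epochs):          # phase 2: fine-tune on stylized fonts
        for font in hard_fonts:
            train_step(model, font)
```

The key design point is that phase 2 starts from phase-1 weights instead of a fresh initialization, which is exactly the transfer-learning setup from the background section.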
  20. Then let's talk about the experiments and results.
  21. For the dataset, we mainly use three types of handwriting fonts: printed, hard-pen, and brush calligraphy, with 40 fonts prepared for each type. The original format is TrueType (.ttf), an outline font standard that defines a consistent style for every character it contains, so we can choose any characters and render them as images. In our experiment, we choose SIMSUN as the source font; its glyph is shown in the figure.
  22. To test our hypotheses, we carry out three comparison experiments. For the first, we prepare two training sets: one includes only hard-pen handwriting fonts, while the other is a mixed dataset including printed, hard-pen, and brush calligraphy fonts. We refer to the first setting as "hard-pen" and the second as "mixed".
  23. For the 2nd experiment, we similarly prepare two training sets, as the figure shows. The first uses 50 fonts with 300 characters chosen from each; 15 characters per font contain the common radical, while the rest do not. The second uses 30 fonts with 500 characters from each, of which 25 contain the common radical. We then compare these two training methods, referring to the first as "reduced" and the second as "original".
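The per-font character selection above can be sketched as a small sampling helper. This is an illustrative sketch only: the function name is hypothetical and the character lists stand in for the real glyph sets.

```python
import random

def sample_characters(radical_chars, other_chars, per_font, radical_quota, seed=0):
    """Pick `per_font` characters for one font, capping how many share the
    common radical (e.g. 15 of 300 in the "reduced" setting, 25 of 500 in
    "original"). Both character lists are placeholders for real glyph sets."""
    rng = random.Random(seed)          # fixed seed keeps the split reproducible
    picked = rng.sample(radical_chars, radical_quota)
    picked += rng.sample(other_chars, per_font - radical_quota)
    return picked
```

Fixing the quota per font, rather than sampling freely, is what guarantees the two training sets differ only in how concentrated the common radical is.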
  24. For the 3rd experiment, we choose 25 hard-pen handwriting fonts that are easy to identify as the easier task, and another 5 hard-pen handwriting fonts that are difficult to recognize as the harder task. We first train on the easier task and then fine-tune on the harder one; we call this method "e2h". For comparison, we combine all 30 fonts and train another model for the same number of epochs; we call this second method "normal".
  25. For each experiment, we display the generated samples and the evaluation metrics. For the 1st experiment, the metrics show that hard-pen achieved better image quality, but not higher similarity.
  26. For the 2nd experiment, the metrics show that the reduced method also achieved better image quality, but did not improve similarity.
  27. For the 3rd experiment, the metrics show that e2h achieved both better image quality and higher similarity.
  28. We give a brief discussion of these results. All three methods achieve higher image quality by PSNR, and our generated samples are less blurred; training from easy to hard tasks via curriculum learning makes sense for Chinese handwriting character synthesis. However, the first two methods did not improve similarity. For the first method, we assume the reason is that strokes differ more significantly among handwriting fonts than within mixed-type fonts; for the second, the selected radical has fewer strokes and a simpler structure than other components.
  29. According to the experimental results and discussion, our main conclusions are listed as follows: