DragGAN is a new AI image-editing tool, developed by researchers at the Max Planck Institute for Informatics and their collaborators, that lets you manipulate images with simple drag controls.
It uses generative AI to create realistic changes to the structure and appearance of objects in images. You can also rotate images as if they were 3D models.
The user edits the image through a drag-and-drop interface, and DragGAN generates a new image that reflects those edits.
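The editing loop can be sketched conceptually. The snippet below is a toy illustration, not the actual DragGAN implementation: it moves a user-chosen handle point toward a target point in small fixed-size steps, mimicking the iterative motion-supervision and point-tracking cycle; all names and parameters here are hypothetical.

```python
import numpy as np

# Toy sketch of a drag-based editing loop (illustrative, not DragGAN's code):
# the handle point takes unit steps toward the target until it arrives.
def drag_edit(handle, target, step_size=1.0, max_iters=1000):
    handle = np.asarray(handle, dtype=float)
    target = np.asarray(target, dtype=float)
    for _ in range(max_iters):
        offset = target - handle
        dist = np.linalg.norm(offset)
        if dist <= step_size:                            # close enough: snap to target
            return target
        handle = handle + step_size * offset / dist      # one fixed-size step toward target
    return handle

final = drag_edit([10.0, 10.0], [50.0, 30.0], step_size=2.0)
```

In the real system each step additionally re-optimizes the GAN's latent code so the generated image content follows the moving point, which is what makes the edit look photorealistic.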
This document discusses cultural heritage and defines it as the creative expression of a people's existence in the past, near past, and present that tells their traditions, beliefs, and achievements. It notes that cultural heritage includes both tangible and intangible forms. Tangible heritage can be physically touched, like monuments and objects, while intangible heritage includes non-physical forms like music, dance, languages, and traditions. The document emphasizes that cultural heritage is important because it conveys identity and values, is unique, can support economic development, and helps people understand cultural diversity.
Clothing and prints of the Renaissance, Baroque, and Rococo time periods
1) The document discusses clothing styles and print motifs during the Renaissance, Rococo, and Baroque periods of art history.
2) During the Renaissance, men and women wore gowns, kirtles, chemises, and other layered clothing. Prints from this period featured colorful patterns like pomegranates.
3) Rococo fashion featured lighter, more ornate styles for both men and women after Louis XIV's death. Prints emphasized natural forms, shells, and pastel colors.
4) Baroque clothing showed influences from the military with elaborate fabrics and patterns. Prints from this period used rich colors like red, green, and blue to depict textures in portraits and floral motifs.
Muga silk is Assam's prized possession, known for its golden color and strength. Sericulture is an important industry in Assam, with muga and eri silks produced traditionally. Women's handloom weaving is a cultural tradition, producing items like the mekhala skirt and patani lower garment. Motifs in weaving depict nature and everyday life. Traditional costumes vary among the Bodo, Dimasa, Mech, and Aitunia tribes of Assam, incorporating locally produced silk and cotton with distinctive styles of dress for men and women. Jewelry like the kopo phul earrings and gaam kharu bangles are notable parts of Assamese adornment.
The document summarizes various traditional arts and crafts from different regions of India, including embroidery techniques like Abla, Bandhani, Batik, and Chikankari. It also describes art forms like Pashmina shawl weaving, Phulkari embroidery, Zardozi embroidery, and styles of painting such as Thanka, Madhubani, Waarli, Tanjore, Kalamkari, and Kangra. Additionally, it outlines crafts involving woodwork, metalwork like Bidri and brassware, silverware, sandalwood carving, cane work, and Sankheda woodwork. Other crafts mentioned include jute weaving and glassware.
A Brief Presentation on costumes of Maharashtra including its culture, costume, history of costume, variety, men and women's wear, jewelry, footwear, present scenario.
Romans wore tunics secured with pins, and togas for ceremonies. Poorer Romans wore simple short tunics, while women wore long pleated dresses called stolas and could cover their heads with a palla. Roman women styled their hair in buns and used curling irons. Nobles and emperors had neat clothing: empresses wore light coats, pallas, and stolas, while emperors wore wool tunics and large purple togas. Senators wore large tunics indoors and added red pallia over white togas when outside.
Folk art encompasses art produced by indigenous cultures or by peasants and laborers without formal training. It is made by people in rural areas and passed down through generations. The document then provides overviews of the folk art traditions of India, Japan, and Africa, including details about their cuisines, dances, music, drama, and folktales.
This document provides an overview of the history and origins of clothing. It discusses that Neanderthals are believed to have been the first humans to make clothing, using animal skins and furs to keep warm. As humans evolved and developed tools like needles, clothing became more sophisticated, with items like tunics, leggings, and fur coats. Theories on the origins of clothing include that it was originally used for modesty, sexual attraction, adornment, and protection from the elements and animals. The document also examines the discovery of Ötzi the Iceman from 5,300 years ago, whose preserved remains showed he wore a complex outfit of stitched leather garments and fur.
The document provides an overview of the ancient Sumerian civilization that emerged in Mesopotamia between the Tigris and Euphrates Rivers. It discusses that during the Neolithic period, settlements developed along the rivers which eventually evolved into complex cities. This led to significant cultural advancements for the Sumerians, including innovations like the wheel, plow, and irrigation canals. The Sumerians were also the first to develop a system of writing, known as cuneiform. Sumerian society was composed of independent city-states, each ruled by a priest-king and organized hierarchically. The Sumerians practiced a polytheistic religion and constructed massive ziggurat temples to appease their gods.
This document provides an introduction to cultural heritage. It defines heritage as anything important passed to future generations. Heritage is divided into natural heritage like landscapes and cultural heritage like traditions. Cultural heritage includes tangible sites and monuments as well as intangible aspects like folklore. Understanding a site's historical, social, aesthetic, and scientific significance helps determine management policies. Cultural identity is nurtured by a country's cultural heritage through understanding tangible sites and cultural behaviors, values, and traditions.
The Ashavali sari originated in Ahmedabad, Gujarat, which was formerly known as Ashaval and was an important textile manufacturing center. The Ashavali sari is known for its rich brocaded patterns woven in twill weave, with intricately woven silken patterns embedded into a gold surface in varied colors, imitating enamel work. Common motifs featured in Ashavali sari borders include parrots, peacocks, lions, doves, trees, and flowers. Historical texts refer to Ahmedabad as a famous center for brocades and to the Ashavali sari as being in high demand. The local weaving technique was called desi vanat.
Phulkari is a traditional Indian craft practiced in Punjab for generations. People from all over the world love this handcrafted skill of the women of Punjab.
PRESENTATION ON BALUCHARI SAREE OF WEST BENGAL, BY AARTI WADHWA
There is a saying about the Baluchari saree: "Makur tane kabbyo gaatha baluchari juri kotha", meaning: weaving poetry and lore with the shuttle, the Baluchari is beyond compare (and I completely agree).
The Baluchari sari has also been granted Geographical Indication (GI) status in India.
Baluchari sarees are preferred for their soft and luxurious hand feel, the richness of the silks used, their fine weave and stylish looks.
The Baluchari sari has won the Presidential award on two occasions for its weaving style and has been prominently displayed in international trade fairs.
I hope my slides will help you understand this particular textile of India.
This document provides an overview of Gujarat's culture, including its festivals, cuisine, crafts, textiles, costumes, and jewelry. Some key points:
- Gujarat is known for folk dances like garba and raas that are performed during Navratri festivals. Traditional Gujarati food is primarily vegetarian and healthy.
- The state has rich textile traditions like Patola silk sarees, Bandhani tie-dye, embroidery styles like Banni and Kutchi that use mirrors and threadwork.
- Traditional Gujarati costumes vary by region and community but include items like chaniya cholis for women and kediyus for men. Bridal wear
This document provides summaries of various traditional crafts and art forms from different regions of India, including Phulkari and Bagh embroidery from Punjab, Chambal Rumal embroidery from Himachal Pradesh, Warli paintings from Maharashtra, Madhubani paintings from Bihar, Lac ware from Rajasthan, the Kundan Jadau jewelry technique from Rajasthan and Gujarat, Tangka paintings from Tibet, Kashidakari embroidery from Kashmir, Blue Pottery from Rajasthan, Gota Patti metal embroidery from Rajasthan, Pattu weaving from Rajasthan, Phad painting scrolls from Rajasthan, and block printing techniques from Bagru and Sanganer near Jaipur.
Sleeveless dresses were very popular throughout the 1950s. Fashion in the 1960s saw major changes, with mini skirts and go-go boots becoming trends; notable 1960s styles like bell bottoms and miniskirts paved the way for modern fashion. The 1970s were characterized by bold prints, bright colors, and wild styles influencing both men's and women's fashion. Shoulder pads and neon colors defined the 1980s, alongside other trends like leg warmers, chunky jewelry, and hot pants. The 1990s saw a mixture of styles, with grunge fashion, cargo pants, and hip-hugging jeans becoming popular. Early 2000s trends included trucker hats and other casual styles.
India has 28 world heritage sites and 25 bio-geographic zones. The country’s long coastline provides a number of attractive beaches, and it offers diverse tourism segments such as adventure, rural, and wildlife tourism.
India ranked 12th among 184 countries in terms of travel & tourism’s total contribution to Gross Domestic Product (GDP) in 2012. The sector’s direct contribution to GDP totalled US$ 34.7 billion in 2012 and is expected to grow to US$ 40.8 billion in 2013. Over 2013–23, the direct contribution is expected to register a growth of 7.8 per cent per annum.
Over 6.6 million foreign tourist arrivals (FTAs) were reported in 2012, expanding at a compounded annual growth rate (CAGR) of 7.8 per cent during 2005-12. Total foreign exchange earnings (FEEs) from tourism grew to over US$ 17.7 billion in 2012, registering a CAGR of 13.1 per cent during 2005-12. In February 2013, FEEs increased by 11.4 per cent to reach US$ 3.4 billion, from US$ 3.1 billion in the same period in 2012.
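The CAGR figures above follow the standard compound-growth formula; a quick illustrative check (the 2005 base below is back-solved from the stated 2012 arrivals and growth rate, not a figure from the source):

```python
# Compound annual growth rate: cagr = (end / start) ** (1 / years) - 1
def cagr(start, end, years):
    return (end / start) ** (1.0 / years) - 1.0

years = 7                                   # 2005 -> 2012
ftas_2012 = 6.6e6                           # foreign tourist arrivals in 2012
ftas_2005 = ftas_2012 / (1.078 ** years)    # implied 2005 base at 7.8% CAGR
rate = cagr(ftas_2005, ftas_2012, years)    # recovers 0.078
```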
Strong growth in per capita income in the country is driving the domestic tourism market. A demographic shift toward a rising young population, coupled with changing lifestyles, is leading to greater expenditure on leisure services. The tourism policy of the Government of India (GOI) aims at speedy implementation of tourism projects, development of integrated tourism circuits, special capacity building in the hospitality sector, and new marketing strategies. In the hotel and tourism sector, the government has also allowed 100 per cent foreign direct investment (FDI) through the automatic route.
Gond paintings originate from the Gond tribe in Madhya Pradesh and have a history of over 1400 years. The Gond people believed that depicting images of the natural world like hills, rivers, and trees showed respect for the spirits that inhabited them. They decorated the walls and floors of their homes with traditional motifs reflecting their close connection to the environment and depicting scenes of daily life. Bright colors like red, blue, yellow, and white are used, with pigments derived from natural sources like plants, leaves, sand, and cow dung. Lines and dots are added to the paintings to convey a sense of movement.
The document discusses costumes and fashion in ancient Mesopotamia. Mesopotamians developed impressive skills in fashioning clothing using wool and flax. Evidence of their clothing and accessories can be seen in sculptures, tablets, pottery, and the intricate costumes created for the film "Intolerance." They placed great importance on fashion and used it to display status. Facial hair and elaborate headgear were common, and they decorated their clothing with motifs of flowers, gold, and beadwork. Mesopotamians were considered one of the most advanced civilizations of their time in their passion for fashion.
Patwa is a thread craft which originated in Rajasthan and is now practiced in parts of Rajasthan, Maharashtra, and Uttar Pradesh. The word Patwa is derived from the Hindi word ‘pat’, meaning silk, and those involved in the silk and cotton thread business are called Patwas. The Patwa are a mainly Hindu community. Traditionally they were weavers, engaged in the jewelry-making business, and worked with silver and golden threads. Nadas, parandis, tassels, pironas for necklaces and payals, and rakhis are all examples of the craft.
The document discusses tourism in Pakistan. It notes that Pakistan has many beautiful places that could attract tourists, but that tourism has declined due to factors like terrorism, attacks on hotels and tourists, and a perception of Pakistan being dangerous. It outlines various regions in Pakistan with tourism potential, like Neelum Valley, Hunza Valley, and Kaghan Valley. However, a lack of infrastructure, high prices, and security issues have hindered tourism growth. Steps are needed to promote tourism through improved security, guidance, publicity, subsidies, and clean environments.
The presentation covers detailed information on the costumes of East Asia, covering countries like Japan, China, the Koreas, and Bhutan, and discusses the history of clothing in these areas from the earliest civilizations.
Kerala is a state in southwest India known for its artisanal handicrafts that showcase centuries-old traditions. Some of the major handicrafts produced in Kerala include brass and bell metal works, coir and cane products, ivory works, lacquer ware, sandalwood carving, textiles, and wooden toys. Specific crafts discussed in the document include bell metal crafts made using the lost wax technique, coir products from Calicut and Kollam, intricately carved ivory sculptures, and painted lacquerware pots and toys. Kerala is also renowned for its handloom textiles, particularly traditional saris such as the plain white Karaikudi sari.
The ancient Egyptians placed great importance on hygiene, grooming, and dress. They bathed regularly, shaved their bodies, and used perfumes and cosmetics. Clothing was made of linen, and both men and women wore wigs, jewelry, and makeup. Children went without clothing until around age 12. The Egyptians enjoyed leisure activities like music, dancing, games, and sports. Their architecture, such as the pyramids and temples, was precisely built for religious and political purposes. They had an elaborate set of burial customs involving mummification and placing goods in tombs to ensure immortality in the afterlife.
Pakistani fashion has evolved significantly over the past several decades from a few pioneering designers to a large prolific industry. In the 1960s, most women's clothing was not produced through an organized industry, though Sughra Kazmi was one of the first recognized couturiers. Maheen Khan established herself as a seamstress in the late 1960s in Lahore and later became a renowned designer. TeeJays emerged in the 1970s as Pakistan's biggest fashion brand by promoting their clothing on popular TV shows. The 1980s saw more designers and labels emerge as fashion became more liberated from conservative influences. In the 2000s, the graduation of young designers from fashion schools helped professionalize and market Pakistani fashion globally.
The document is a project proposal that aims to advance the creativity and adaptability of basket weavers through workshops teaching innovative techniques using locally available materials. The primary objectives are to preserve cultural heritage, foster innovation among weavers, and create sustainable livelihoods for communities. Key activities include collaborating with experts to design workshops, conducting training, promoting local materials, and providing market access and monitoring of the project's success.
IRJET: Computer Aided Touchless Palmprint Recognition Using SIFT (IRJET Journal)
This document discusses a computer aided touchless palmprint recognition system using Scale Invariant Feature Transform (SIFT). SIFT is used to extract features from touchless palmprint images that are invariant to changes in scale, rotation, and translation. The system involves preprocessing images, extracting SIFT features, and matching features to recognize and authenticate individuals. An experiment was conducted using 16 real palmprint images with varying conditions. The system achieved 93.75% accuracy in recognition using SIFT features, demonstrating its effectiveness for touchless palmprint recognition compared to other approaches. Future work could explore using color information and developing algorithms to handle variations like cosmetics or injuries.
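The matching stage described above can be sketched in isolation. The snippet below is a simplified illustration, not the paper's code: synthetic 128-dimensional vectors stand in for real SIFT descriptors, and Lowe's ratio test (the standard filter for SIFT matches) accepts only unambiguous nearest-neighbour matches.

```python
import numpy as np

# Sketch of SIFT-style descriptor matching with Lowe's ratio test
# (synthetic 128-d descriptors stand in for real SIFT output).
def ratio_test_matches(probe_desc, gallery_desc, ratio=0.75):
    matches = []
    for i, d in enumerate(probe_desc):
        dists = np.linalg.norm(gallery_desc - d, axis=1)  # distance to every gallery descriptor
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:           # keep only unambiguous matches
            matches.append((i, int(best)))
    return matches

rng = np.random.default_rng(0)
enrolled = rng.normal(size=(20, 128))                     # enrolled palmprint descriptors
probe = enrolled[:5] + 0.01 * rng.normal(size=(5, 128))   # noisy re-capture of the first five
matches = ratio_test_matches(probe, enrolled)
```

In a full system one would extract real SIFT descriptors from preprocessed palm images (e.g. with an OpenCV-style feature extractor) and decide identity from the count of ratio-test matches.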
Hand gesture recognition using support vector machine (The IJES)
1) The document describes a system for hand gesture recognition using support vector machines. It uses Canny's edge detection algorithm and histograms of oriented gradients (HOG) for feature extraction from input images of hand gestures.
2) The system is trained using a dataset of predefined hand gestures. During testing, it compares the features extracted from new input images to those in the training dataset and classifies the gesture using an SVM classifier.
3) Experimental results found the system could accurately recognize 20 different static hand gestures in complex backgrounds. However, the authors note that future work could focus on real-time gesture recognition and reducing complexity for faster processing.
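The HOG feature-extraction step above can be sketched as follows. This is a hedged simplification, not the paper's pipeline: it computes a single global orientation histogram rather than the usual cell-and-block HOG layout, and a nearest-feature rule stands in for the SVM classifier.

```python
import numpy as np

# Simplified HOG-style feature: bin gradient orientations of an image
# into one L1-normalised, magnitude-weighted histogram.
def hog_feature(img, bins=9):
    gy, gx = np.gradient(img.astype(float))               # image gradients
    mag = np.hypot(gx, gy)                                # gradient magnitude
    ang = np.mod(np.arctan2(gy, gx), np.pi)               # unsigned orientation in [0, pi)
    hist, _ = np.histogram(ang, bins=bins, range=(0, np.pi), weights=mag)
    total = hist.sum()
    return hist / total if total > 0 else hist

# Two toy "gestures": a left-to-right intensity ramp vs a top-to-bottom one.
gesture_a = np.tile(np.arange(16.0), (8, 1))              # x-gradients only
gesture_b = np.tile(np.arange(16.0)[:, None], (1, 8))     # y-gradients only
f_a, f_b = hog_feature(gesture_a), hog_feature(gesture_b)

probe = hog_feature(np.tile(np.arange(16.0), (8, 1)))     # a re-capture of gesture A
label = "A" if np.linalg.norm(probe - f_a) <= np.linalg.norm(probe - f_b) else "B"
```

A real implementation would divide the image into cells, normalise histograms over overlapping blocks, concatenate them into one long vector, and train an SVM on those vectors.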
Hand gesture recognition using support vector machinetheijes
1) The document describes a system for hand gesture recognition using support vector machines. It uses Canny's edge detection algorithm and histogram of gradients (HOG) for feature extraction from input images of hand gestures.
2) The system is trained using a dataset of predefined hand gestures. During testing, it compares the features extracted from new input images to those in the training dataset and classifies the gesture using an SVM classifier.
3) Experimental results found the system could accurately recognize 20 different static hand gestures in complex backgrounds. However, the authors note that future work could focus on real-time gesture recognition and reducing complexity for faster processing.
Image Features Matching and Classification Using Machine LearningIRJET Journal
This document presents a research paper that proposes a new methodology for image feature matching and classification using machine learning. The paper aims to improve accuracy and robustness in feature extraction and matching between digital images. The proposed methodology extracts features from images using machine learning, matches common features between images, and classifies objects. It is evaluated based on precision, recall, and F1-score, and shows improved performance over traditional Scale Invariant Feature Transform (SIFT) techniques on tested datasets with different objects. The proposed approach extracts fewer features and takes less computation time than traditional methods.
IRJET - Deep Learning Approach to Inpainting and Outpainting SystemIRJET Journal
This document discusses a deep learning approach for image inpainting and outpainting. It proposes a new generative model-based approach using a fully convolutional neural network that can process images with multiple holes at variable locations and sizes. The model aims to not only synthesize novel image structures, but also explicitly utilize surrounding image features as references during training to generate better predictions. Experiments on faces, textures and natural images demonstrate the proposed approach generates higher quality inpainting results than existing methods. It aims to address limitations of CNNs in borrowing information from distant areas by leveraging texture and patch synthesis approaches.
1. The document summarizes a student project that aims to create a virtual try-on application using augmented reality. It surveys existing methods for tasks like clothing segmentation, human pose estimation, and virtual try-on that could be used to build the application.
2. It discusses approaches the students investigated like using depth cameras for measurements, non-depth based methods using computer vision, parsing clothes and humans, and existing work on 2D virtual try-on.
3. The students implemented initial modules for their pipeline including a U-Net for clothing segmentation trained on images and masks from the Viton dataset.
This document discusses modeling techniques for virtual environments, including geometric modeling, behavior modeling, and model management. Geometric modeling describes how to represent the shape and appearance of virtual objects. Behavior modeling deals with modeling the interactions and intelligent behavior of virtual objects, including using sensors and defining levels of autonomy. Model management techniques are needed to efficiently handle large complex virtual worlds with many objects due to the heavy computation and memory usage required.
IRJET - A Survey Paper on Efficient Object Detection and Matching using F...IRJET Journal
This document summarizes an approach for efficient object detection and matching in images and videos. It proposes a classification scheme that classifies extracted features as either object or non-object features. This binary classification approach can be used for object detection and matching in a way that is more robust and faster compared to traditional methods. The classification stage also enables faster object registration. The approach is evaluated to show advantages for object matching and registration compared to other methods. It has potential applications for real-time object tracking and detection.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
An interactive image segmentation using multiple user inputªseSAT Journals
Abstract In this paper, we consider the Interactive image segmentation with multiple user inputs. The proposed system is the use of multiple intuitive user inputs to better reflect the user’s intention. The use of multiple types of intuitive inputs provides the user’s intention under different scenario. The proposed method is developed as a combined segmentation and editing tool. It incorporates a simple user interface and a fast and reliable segmentation based on 1D segment matching. The user is required to click just a few "control points" on the desired object border, and let the algorithm complete the rest. The user can then edit the result by adding, removing and moving control points, where each interaction follows by an automatic, real-time segmentation by the algorithm. Interactive image segmentation involves a proposed algorithm, Constrained Random walks algorithm. The Constrained Random Walks algorithm facilitates the use of three types of user inputs. 1. Foreground and Background seed input 2. Soft Constraint input 3. Hard Constraint input. The effectiveness of the proposed method is validated by experimental results. The proposed algorithm is algorithmically simple, efficient and less time consuming. Keywords: Interactive image segmentation, Interactive image segmentation, digital image editing, multiple user inputs, random walks algorithm.
MAGI is a software infrastructure for developing and executing multi-agent geosimulation (MAG) models. It provides libraries, tools and a graphical user interface to design, build, run and analyze MAG simulations. MAGI models represent real-world geographic phenomena using agents that interact spatially within a GIS environment. It supports various agent and spatial representations, scheduling approaches, and interoperability with GIS software.
Implementation of Object Tracking for Real Time VideoIDES Editor
Real-time tracking of object boundaries is an
important task in many vision applications. Here we propose
an approach to implement the level set method. This approach
does not need to solve any partial differential equations (PDFs),
thus reducing the computation dramatically compared with
optimized narrow band techniques proposed before. With our
approach, real-time level-set based video tracking can be
achieved.
This document describes a method for interactively manipulating parametric 3D shapes through direct brush strokes on the shape itself, rather than through adjusting individual hyper-parameters. The method works by amending the underlying direct acyclic graph (DAG) of the parametric shape to enable local differentiation of the shape with respect to its hyper-parameters. This allows the user's brush strokes to be interpreted as changes to the hyper-parameters through inverse control, without requiring any additional setup by the shape designer. The method is automatic, flexible, and non-invasive to existing parametric shape engines.
Semi-Supervised Method of Multiple Object Segmentation with a Region Labeling...sipij
Efficient and efficient multiple object segmentation is an important task in computer vision and object recognition. In this work; we address a method to effectively discover a user’s concept when multiple objects of interest are involved in content based image retrieval. The proposed method incorporate a framework for multiple object retrieval using semi-supervised method of similar region merging and flood fill which models the spatial and appearance relations among image pixels. To improve the effectiveness of similarity based region merging we propose a new similarity based object retrieval. The users only need to roughly indicate the after which steps desired objects contour is obtained during the automatic merging of similar regions. A novel similarity based region merging mechanism is proposed to guide the merging process with the help of mean shift technique and objects detection using region labeling and flood fill. A region R is merged with its adjacent regions Q if Q has highest similarity with Q (using Bhattacharyya descriptor) among all Q’s adjacent regions. The proposed method automatically merges the regions that are initially segmented through mean shift technique, and then effectively extracts the object contour by merging all similar regions. Extensive experiments are performed on 12 object classes (224 images total) show promising results.
Partial Object Detection in Inclined Weather ConditionsIRJET Journal
This document provides a comprehensive analysis of imbalance problems in object detection. It presents a taxonomy to classify different types of imbalances and discusses solutions proposed in literature. The analysis highlights significant gaps including existing imbalances that require further attention, as well as entirely new imbalances that have never been addressed before. A survey of imbalance problems caused by weather conditions and common object imbalances is conducted. Methods for addressing imbalances include data augmentation using GANs and balancing training based on class performance.
This document provides an overview of computer graphics. It discusses interactive graphics where the user has control over the image and passive graphics where the image is produced automatically. Interactive graphics allow for advantages like more efficient communication and understanding of data through dynamic and user-controlled visualization. The document also describes how an interactive graphics display works with components like a frame buffer and display controller that outputs images to a monitor.
This document provides an overview of computer graphics and its applications. It discusses interactive graphics, where the user can control the image, versus passive graphics which produce images automatically. Interactive graphics allow for advantages like motion dynamics and update dynamics. The document then covers how interactive graphics displays work, using a frame buffer, monitor, and display controller. It concludes with a discussion of various applications of computer graphics, such as cartography, user interfaces, scientific visualization, CAD/CAM, simulation, art, process control and more.
Cartoonization of images using machine LearningIRJET Journal
The document presents a method for cartoonization of images using machine learning. It discusses converting real-world photos into cartoon images using a GAN-based approach. The key steps include:
1. Importing required modules like OpenCV, NumPy for image processing and GAN modeling.
2. Pre-processing input images by converting them to grayscale, smoothing, and edge detection.
3. Training a GAN using cartoon and photo images to generate new cartoon images.
4. For video cartoonization, frames are extracted from videos using OpenCV, individually cartoonized using the GAN, and reconstructed into a cartoon video.
The proposed system is able to convert images and videos to cartoon style in real-time using deep learning
International Journal of Computational Engineering Research(IJCER)ijceronline
International Journal of Computational Engineering Research(IJCER) is an intentional online Journal in English monthly publishing journal. This Journal publish original research work that contributes significantly to further the scientific knowledge in engineering and Technology
Image Segmentation from RGBD Images by 3D Point Cloud Attributes and High-Lev...CSCJournals
The document describes an image segmentation algorithm that uses both color and depth features extracted from RGBD images captured by a Kinect sensor. The algorithm clusters pixels into segments based on their color, texture, 3D spatial coordinates, surface normals, and the output of a graph-based segmentation algorithm. Depth features help resolve illumination issues and occlusion that cannot be handled by color-only methods. The algorithm was tested on commercial building images and showed potential for real-time applications.
Face Recognition Based on Image Processing in an Advanced Robotic SystemIRJET Journal
This document describes a face recognition system used to control a robotic system. The system works in two stages: first, face recognition is used to unlock the system by validating a user's face. Then, different navigation images are used to control the robot's motion. Face recognition is implemented using support vector machine (SVM), histogram of oriented gradients (HOG), and k-nearest neighbors (KNN) algorithms in MATLAB. The process is based on machine learning concepts where the system is trained in a supervised manner to recognize faces and control the robot.
SIGGRAPH ’23 Conference Proceedings, August 6–10, 2023, Los Angeles, CA, USA. X. Pan, A. Tewari, T. Leimkühler, L. Liu, A. Meka, C. Theobalt
1 INTRODUCTION
Deep generative models such as generative adversarial networks (GANs) [Goodfellow et al. 2014] have achieved unprecedented success in synthesizing random photorealistic images. In real-world applications, a critical functionality requirement of such learning-based image synthesis methods is controllability over the synthesized visual content. For example, social-media users might want to adjust the position, shape, expression, and body pose of a human or animal in a casually captured photo; professional movie pre-visualization and media editing may require efficiently creating sketches of scenes with certain layouts; and car designers may want to interactively modify the shape of their creations. To satisfy these diverse user requirements, an ideal controllable image synthesis approach should possess the following properties: 1) Flexibility: it should be able to control different spatial attributes including position, pose, shape, expression, and layout of the generated objects or animals; 2) Precision: it should be able to control the spatial attributes with high precision; 3) Generality: it should be applicable to different object categories, not limited to a certain category. While previous works satisfy only one or two of these properties, we aim to achieve all three in this work.
Most previous approaches gain controllability of GANs via prior 3D models [Deng et al. 2020; Ghosh et al. 2020; Tewari et al. 2020] or supervised learning that relies on manually annotated data [Abdal et al. 2021; Isola et al. 2017; Ling et al. 2021; Park et al. 2019; Shen et al. 2020]. Thus, these approaches fail to generalize to new object categories, often control a limited range of spatial attributes, or provide little control over the editing process. Recently, text-guided image synthesis has attracted attention [Ramesh et al. 2022; Rombach et al. 2021; Saharia et al. 2022]. However, text guidance lacks precision and flexibility in terms of editing spatial attributes. For example, it cannot be used to move an object by a specific number of pixels.
To achieve flexible, precise, and generic controllability of GANs, in this work we explore a powerful yet much less explored interactive point-based manipulation. Specifically, we allow users to click any number of handle points and target points on the image, and the goal is to drive the handle points to reach their corresponding target points. As shown in Fig. 1, this point-based manipulation allows users to control diverse spatial attributes and is agnostic to object categories. The approach with the closest setting to ours is UserControllableLT [Endo 2022], which also studies dragging-based manipulation. Compared to it, the problem studied in this paper has two more challenges: 1) we consider the control of more than one point, which their approach does not handle well; 2) we require the handle points to precisely reach the target points, while their approach does not. As we will show in experiments, handling more than one point with precise position control enables much more diverse and accurate image manipulation.
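Concretely, a drag edit of this kind can be represented as nothing more than pairs of pixel coordinates, together with a stopping condition on the handle-to-target distance. The sketch below is purely illustrative (the names and coordinate convention are our assumptions, not code from the paper):

```python
# Illustrative sketch: a drag edit is a set of (handle, target) pixel
# coordinate pairs in (x, y) convention, with y increasing downward.
# Editing finishes when every handle point has reached its target.
edits = [((120, 85), (140, 85)),   # drag this point 20 px to the right
         ((60, 200), (60, 180))]   # drag this point 20 px upward

def reached(handles, targets, tol=1.0):
    """True once every handle point lies within tol pixels of its target."""
    return all(((hx - tx) ** 2 + (hy - ty) ** 2) ** 0.5 <= tol
               for (hx, hy), (tx, ty) in zip(handles, targets))

handles = [h for h, _ in edits]
targets = [t for _, t in edits]
print(reached(handles, targets))   # False: the drag has not finished
print(reached(targets, targets))   # True
```

Because several pairs can be active at once, the stopping test must hold for every pair simultaneously, which is exactly the multi-point requirement that distinguishes this setting from UserControllableLT.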
To achieve such interactive point-based manipulation, we propose DragGAN, which addresses two sub-problems: 1) supervising the handle points to move towards the targets, and 2) tracking the handle points so that their positions are known at each editing step. Our technique is built on the key insight that the feature space of a GAN is sufficiently discriminative to enable both motion supervision and precise point tracking. Specifically, the motion supervision is achieved via a shifted feature patch loss that optimizes the latent code. Each optimization step moves the handle points closer to the targets; point tracking is then performed through nearest neighbor search in the feature space. This optimization process is repeated until the handle points reach the targets. DragGAN also allows users to optionally draw a region of interest to perform region-specific editing. Since DragGAN does not rely on any additional networks like RAFT [Teed and Deng 2020], it achieves efficient manipulation, taking only a few seconds on a single RTX 3090 GPU in most cases. This allows for live, interactive editing sessions in which the user can quickly iterate on different layouts until the desired output is achieved.
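The tracking step described above can be illustrated in isolation: after each motion-supervision update, the handle's new position is found by nearest-neighbor search, comparing the handle's original feature vector against features in a small neighborhood of its previous position. The toy below is a minimal numpy sketch under assumed shapes and names, not the paper's implementation (which searches StyleGAN's intermediate feature maps):

```python
import numpy as np

def track_point(feat, f0, prev, r=3):
    """Toy nearest-neighbor point tracking: search the r-neighborhood of
    the previous handle position in a (H, W, C) feature map for the
    feature vector closest (in L2 distance) to the original handle
    feature f0, and return its (row, col) position."""
    H, W, _ = feat.shape
    y0, x0 = prev
    best, best_d = prev, np.inf
    for y in range(max(0, y0 - r), min(H, y0 + r + 1)):
        for x in range(max(0, x0 - r), min(W, x0 + r + 1)):
            d = np.linalg.norm(feat[y, x] - f0)
            if d < best_d:
                best, best_d = (y, x), d
    return best

# Synthetic feature map with a distinctive feature planted at (12, 7),
# standing in for where the handle content moved after one update step.
rng = np.random.default_rng(0)
feat = rng.normal(size=(32, 32, 8))
f0 = np.full(8, 5.0)
feat[12, 7] = f0
print(track_point(feat, f0, prev=(11, 6)))  # -> (12, 7)
```

In the actual method this search alternates with gradient steps on the latent code, so the handle position only ever moves a few pixels between searches, which is why a small search radius suffices.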
We conduct an extensive evaluation of DragGAN on diverse
datasets including animals (lions, dogs, cats, and horses), humans
(face and whole body), cars, and landscapes. As shown in Fig. 1,
our approach effectively moves the user-defined handle points to
the target points, achieving diverse manipulation effects across
many object categories. Unlike conventional shape deformation
approaches that simply apply warping [Igarashi et al. 2005], our
deformation is performed on the learned image manifold of a GAN,
which tends to obey the underlying object structures. For example,
our approach can hallucinate occluded content, like the teeth inside
a lion’s mouth, and can deform following the object’s rigidity, like
the bending of a horse leg. We also develop a GUI for users to
interactively perform the manipulation by simply clicking on the
image. Both qualitative and quantitative comparison confirms the
advantage of our approach over UserControllableLT. Furthermore,
our GAN-based point tracking algorithm also outperforms existing
point tracking approaches such as RAFT [Teed and Deng 2020] and
PIPs [Harley et al. 2022] for GAN-generated frames. Moreover,
by combining with GAN inversion techniques, our approach also
serves as a powerful tool for real image editing.
2 RELATED WORK
2.1 Generative Models for Interactive Content Creation
Most current methods use generative adversarial networks (GANs)
or diffusion models for controllable image synthesis.
Unconditional GANs. GANs are generative models that transform
low-dimensional randomly sampled latent vectors into photorealis-
tic images. They are trained using adversarial learning and can be
used to generate high-resolution photorealistic images [Creswell
et al. 2018; Goodfellow et al. 2014; Karras et al. 2021, 2019]. Most
GAN models like StyleGAN [Karras et al. 2019] do not directly
enable controllable editing of the generated images.
Conditional GANs. Several methods have proposed conditional
GANs to address this limitation. Here, the network receives a con-
ditional input, such as segmentation map [Isola et al. 2017; Park
et al. 2019] or 3D variables [Deng et al. 2020; Ghosh et al. 2020], in
addition to the randomly sampled latent vector to generate photo-
realistic images. Instead of modeling the conditional distribution,
EditGAN [Ling et al. 2021] enables editing by first modeling a joint
distribution of images and segmentation maps, and then computing
new images corresponding to edited segmentation maps.
Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold. SIGGRAPH ’23 Conference Proceedings, August 6–10, 2023, Los Angeles, CA, USA
Controllability using Unconditional GANs. Several methods have
been proposed for editing unconditional GANs by manipulating the
input latent vectors. Some approaches find meaningful latent direc-
tions via supervised learning from manual annotations or prior 3D
models [Abdal et al. 2021; Leimkühler and Drettakis 2021; Patashnik
et al. 2021; Shen et al. 2020; Tewari et al. 2020]. Other approaches
compute the important semantic directions in the latent space in
an unsupervised manner [Härkönen et al. 2020; Shen and Zhou
2020; Zhu et al. 2023]. Recently, the controllability of coarse object
position is achieved by introducing intermediate “blobs” [Epstein
et al. 2022] or heatmaps [Wang et al. 2022b]. All of these approaches
enable editing of either image-aligned semantic attributes such as
appearance, or coarse geometric attributes such as object position
and pose. While Editing-in-Style [Collins et al. 2020] showcases
some spatial attributes editing capability, it can only achieve this by
transferring local semantics between different samples. In contrast
to these methods, our approach allows users to perform fine-grained
control over the spatial attributes using point-based editing.
GANWarping [Wang et al. 2022a] also uses point-based editing;
however, it only enables out-of-distribution image editing. A few
warped images can be used to update the generative model such
that all generated images demonstrate similar warps. However, this
method does not ensure that the warps lead to realistic images.
Further, it does not enable controls such as changing the 3D pose
of the object. Similar to us, UserControllableLT [Endo 2022] en-
ables point-based editing by transforming latent vectors of a GAN.
However, this approach only supports editing using a single point
being dragged on the image and does not handle multiple-point
constraints well. In addition, the control is not precise, i.e., after
editing, the target point is often not reached.
3D-aware GANs. Several methods modify the architecture of the
GAN to enable 3D control [Chan et al. 2022, 2021; Chen et al. 2022;
Gu et al. 2022; Pan et al. 2021; Schwarz et al. 2020; Tewari et al.
2022; Xu et al. 2022]. Here, the model generates 3D representations
that can be rendered using a physically-based analytic renderer.
However, unlike our approach, control is limited to global pose or
lighting.
Diffusion Models. More recently, diffusion models [Sohl-Dickstein
et al. 2015] have enabled image synthesis at high quality [Ho et al.
2020; Song et al. 2020, 2021]. These models iteratively denoise a
randomly sampled noise to create a photorealistic image. Recent
models have shown expressive image synthesis conditioned on text
inputs [Ramesh et al. 2022; Rombach et al. 2021; Saharia et al. 2022].
However, natural language does not enable fine-grained control
over the spatial attributes of images, and thus, all text-conditional
methods are restricted to high-level semantic editing. In addition,
current diffusion models are slow since they require multiple denois-
ing steps. While progress has been made toward efficient sampling,
GANs are still significantly more efficient.
2.2 Point Tracking
To track points in videos, an obvious approach is through optical
flow estimation between consecutive frames. Optical flow estimation
is a classic problem that estimates motion fields between two images.
Conventional approaches solve optimization problems with hand-
crafted criteria [Brox and Malik 2010; Sundaram et al. 2010], while
deep learning-based approaches started to dominate the field in
recent years due to better performance [Dosovitskiy et al. 2015;
Ilg et al. 2017; Teed and Deng 2020]. These deep learning-based
approaches typically use synthetic data with ground truth optical
flow to train the deep neural networks. Among them, the most
widely used method now is RAFT [Teed and Deng 2020], which
estimates optical flow via an iterative algorithm. Recently, Harley
et al. [2022] combine this iterative algorithm with a conventional
“particle video” approach, giving rise to a new point tracking method
named PIPs. PIPs considers information across multiple frames and
thus handles long-range tracking better than previous approaches.
In this work, we show that point tracking on GAN-generated
images can be performed without using any of the aforementioned
approaches or additional neural networks. We reveal that the fea-
ture spaces of GANs are discriminative enough such that tracking
can be achieved simply via feature matching. While some previous
works also leverage the discriminative feature in semantic segmen-
tation [Tritrong et al. 2021; Zhang et al. 2021], we are the first to
connect the point-based editing problem to the intuition of discrim-
inative GAN features and design a concrete method. Getting rid of
additional tracking models allows our approach to run much more
efficiently to support interactive editing. Despite the simplicity of
our approach, we show that it outperforms the state-of-the-art point
tracking approaches including RAFT and PIPs in our experiments.
3 METHOD
This work aims to develop an interactive image manipulation method
for GANs where users only need to click on the images to define
some pairs of (handle point, target point) and drive the handle points
to reach their corresponding target points. Our study is based on
the StyleGAN2 architecture [Karras et al. 2020]. Here we briefly
introduce the basics of this architecture.
StyleGAN Terminology. In the StyleGAN2 architecture, a 512-dimensional latent code 𝒛 ∈ N(0, 𝑰) is mapped to an intermediate latent code 𝒘 ∈ R^512 via a mapping network. The space of 𝒘 is commonly referred to as W. 𝒘 is then sent to the generator 𝐺 to produce the output image I = 𝐺(𝒘). In this process, 𝒘 is copied several times
and sent to different layers of the generator 𝐺 to control different
levels of attributes. Alternatively, one can also use a different 𝒘 for each layer, in which case the input is 𝒘 ∈ R^{𝑙×512}, denoted the W+ space, where 𝑙 is the number of layers. This less constrained W+ space is
shown to be more expressive [Abdal et al. 2019]. As the generator
𝐺 learns a mapping from a low-dimensional latent space to a much
higher dimensional image space, it can be seen as modelling an
image manifold [Zhu et al. 2016].
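The W and W+ spaces described above can be illustrated with a toy sketch, where `mapping_network` is merely a stand-in for StyleGAN2's real MLP mapping network and all numerical details (layer count, weights) are illustrative assumptions, not the actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

def mapping_network(z, weights):
    # Stand-in for StyleGAN2's MLP mapping network: maps a latent
    # z ~ N(0, I) in R^512 to an intermediate code w in R^512.
    return np.tanh(z @ weights)

num_layers = 14                        # synthesis layers l (depends on resolution)
weights = 0.05 * rng.standard_normal((512, 512))
z = rng.standard_normal(512)           # z sampled from N(0, I)
w = mapping_network(z, weights)        # a single code in the W space

# W space: the same w is copied to all l synthesis layers.
w_broadcast = np.tile(w, (num_layers, 1))          # shape (l, 512)

# W+ space: each layer may receive its own code, so the input
# lives in R^{l x 512} and is less constrained than W.
w_plus = w_broadcast.copy()
w_plus[:6] = mapping_network(rng.standard_normal(512), weights)  # vary early layers

assert w_broadcast.shape == (num_layers, 512) and w_plus.shape == (num_layers, 512)
```

Varying only some per-layer codes, as in the last line, is the kind of selective control that the W+ space makes possible.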
3.1 Interactive Point-based Manipulation
An overview of our image manipulation pipeline is shown in Fig. 2.
For any image I ∈ R^{3×𝐻×𝑊} generated by a GAN with latent code 𝒘, we allow the user to input a number of handle points {𝒑𝑖 = (𝑥_{𝑝,𝑖}, 𝑦_{𝑝,𝑖}) | 𝑖 = 1, 2, ..., 𝑛} and their corresponding target points {𝒕𝑖 = (𝑥_{𝑡,𝑖}, 𝑦_{𝑡,𝑖}) | 𝑖 = 1, 2, ..., 𝑛} (i.e., the corresponding target point of 𝒑𝑖 is 𝒕𝑖). The goal is to move the object in the image such that the
SIGGRAPH ’23 Conference Proceedings, August 6–10, 2023, Los Angeles, CA, USA. X. Pan, A. Tewari, T. Leimkühler, L. Liu, A. Meka, C. Theobalt
Fig. 2. Overview of our pipeline. Given a GAN-generated image, the user only needs to set several handle points (red dots), target points (blue dots), and
optionally a mask denoting the movable region during editing (brighter area). Our approach iteratively performs motion supervision (Sec. 3.2) and point tracking
(Sec. 3.3). The motion supervision step drives the handle points (red dots) to move towards the target points (blue dots) and the point tracking step updates
the handle points to track the object in the image. This process continues until the handle points reach their corresponding target points.
semantic positions (e.g., the nose and the jaw in Fig. 2) of the handle
points reach their corresponding target points. We also allow the
user to optionally draw a binary mask M denoting which region of
the image is movable.
Given these user inputs, we perform image manipulation in an
optimization manner. As shown in Fig. 2, each optimization step
consists of two sub-steps, including 1) motion supervision and 2)
point tracking. In motion supervision, a loss that enforces handle
points to move towards target points is used to optimize the latent
code 𝒘. After one optimization step, we get a new latent code 𝒘′
and a new image I′. The update would cause a slight movement
of the object in the image. Note that the motion supervision step
only moves each handle point towards its target by a small step but
the exact length of the step is unclear as it is subject to complex
optimization dynamics and therefore varies for different objects
and parts. Thus, we then update the positions of the handle points
{𝒑𝑖 } to track the corresponding points on the object. This tracking
process is necessary because if the handle points (e.g., nose of the
lion) are not accurately tracked, then in the next motion supervision
step, wrong points (e.g., face of the lion) will be supervised, leading
to undesired results. After tracking, we repeat the above optimiza-
tion step based on the new handle points and latent codes. This
optimization process continues until the handle points {𝒑𝑖 } reach
the position of the target points {𝒕𝑖 }, which usually takes 30-200
iterations in our experiments. The user can also stop the optimiza-
tion at any intermediate step. After editing, the user can input new
handle and target points and continue editing until satisfied with
the results.
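The alternating loop above can be summarized in a schematic sketch. Both sub-steps are stubbed out here with a direct move of the handle coordinates, since the real method moves points only indirectly by optimizing the latent code and then re-locating them via feature matching:

```python
import numpy as np

def drag_loop(handles, targets, step=1.0, d=1.0, max_iters=200):
    """Schematic of the DragGAN editing loop. Each iteration corresponds to
    one motion-supervision step (which in the real method moves the image
    content a small, a-priori unknown distance) followed by point tracking
    (which re-estimates the handle positions). Here both are replaced by a
    direct step of the handles toward the targets."""
    handles = np.asarray(handles, dtype=float).copy()
    targets = np.asarray(targets, dtype=float)
    for it in range(max_iters):
        dist = np.linalg.norm(targets - handles, axis=1)
        if np.all(dist <= d):                  # all handles reached their targets
            return handles, it
        direction = (targets - handles) / np.maximum(dist[:, None], 1e-8)
        # "motion supervision": move each handle a small step toward its target
        handles += np.minimum(step, dist)[:, None] * direction
        # "point tracking": would re-locate `handles` on the object here
    return handles, max_iters

final, iters = drag_loop([[10, 10], [50, 20]], [[30, 40], [55, 25]])
assert np.all(np.linalg.norm(np.array([[30, 40], [55, 25]]) - final, axis=1) <= 1.0)
```

The loop terminates either when every handle is within `d` pixels of its target or at a maximum iteration count, mirroring the 30-200 iterations reported above.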
3.2 Motion Supervision
How to supervise the point motion for a GAN-generated image has
not been much explored before. In this work, we propose a motion
supervision loss that does not rely on any additional neural net-
works. The key idea is that the intermediate features of the generator
are very discriminative such that a simple loss suffices to supervise
motion. Specifically, we consider the feature maps F after the 6th
block of StyleGAN2, which performs the best among all features due
to a good trade-off between resolution and discriminativeness. We
resize F to have the same resolution as the final image via bilinear
Fig. 3. Method. Our motion supervision is achieved via a shifted patch loss
on the feature maps of the generator. We perform point tracking on the
same feature space via the nearest neighbor search.
interpolation. As shown in Fig. 3, to move a handle point 𝒑𝑖 to the
target point 𝒕𝑖, our idea is to supervise a small patch around 𝒑𝑖
(red circle) to move towards 𝒕𝑖 by a small step (blue circle). We use
Ω1(𝒑𝑖,𝑟1) to denote the pixels whose distance to 𝒑𝑖 is less than 𝑟1,
then our motion supervision loss is:
L = Σ_{𝑖=1}^{𝑛} Σ_{𝒒𝑖 ∈ Ω1(𝒑𝑖, 𝑟1)} ∥F(𝒒𝑖) − F(𝒒𝑖 + 𝒅𝑖)∥1 + 𝜆 ∥(F − F0) · (1 − M)∥1,   (1)
where F(𝒒) denotes the feature values of F at pixel 𝒒, 𝒅𝑖 = (𝒕𝑖 − 𝒑𝑖)/∥𝒕𝑖 − 𝒑𝑖∥2 is a normalized vector pointing from 𝒑𝑖 to 𝒕𝑖 (𝒅𝑖 = 0 if 𝒕𝑖 = 𝒑𝑖), and F0 denotes the feature maps corresponding to the initial image. Note
that the first term is summed up over all handle points {𝒑𝑖 }. As the
components of 𝒒𝑖 +𝒅𝑖 are not integers, we obtain F(𝒒𝑖 +𝒅𝑖) via bilin-
ear interpolation. Importantly, when performing back-propagation
using this loss, the gradient is not back-propagated through F(𝒒𝑖).
This will motivate 𝒑𝑖 to move to 𝒑𝑖 + 𝒅𝑖 but not vice versa. In case
the binary mask M is given, we keep the unmasked region fixed with
a reconstruction loss shown as the second term. At each motion
supervision step, this loss is used to optimize the latent code 𝒘 for
one step. 𝒘 can be optimized either in the W space or in the W+
Fig. 4. Qualitative comparison of our approach to UserControllableLT [Endo 2022] on the task of moving handle points (red dots) to target points (blue dots).
Our approach achieves more natural and superior results on various datasets. More examples are provided in Fig. 10.
space, depending on whether the user wants a more constrained
image manifold or not. As the W+ space makes out-of-distribution manipulations easier to achieve (e.g., the cat in Fig. 16), we use W+ in this
work for better editability. In practice, we observe that the spatial
attributes of the image are mainly affected by the 𝒘 for the first
6 layers while the remaining ones only affect appearance. Thus,
inspired by the style-mixing technique [Karras et al. 2019], we only
update the 𝒘 for the first 6 layers while fixing others to preserve the
appearance. This selective optimization leads to the desired slight
movement of image content.
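Eq. (1) and the stop-gradient trick can be sketched in PyTorch as follows. This is an illustrative reimplementation under simplifying assumptions (all function names are ours, not the authors' code): F is treated here as a free tensor, whereas in the method it is produced by the generator and the loss actually updates the first six entries of 𝒘 through backpropagation.

```python
import torch
import torch.nn.functional as nnf

def bilinear_sample(F, coords):
    """Sample feature map F (1, C, H, W) at float pixel coords (N, 2) as (x, y)."""
    _, C, H, W = F.shape
    grid = torch.empty_like(coords)
    grid[:, 0] = 2 * coords[:, 0] / (W - 1) - 1    # normalize x to [-1, 1]
    grid[:, 1] = 2 * coords[:, 1] / (H - 1) - 1    # normalize y to [-1, 1]
    out = nnf.grid_sample(F, grid.view(1, -1, 1, 2),
                          mode='bilinear', align_corners=True)
    return out.view(C, -1).t()                     # (N, C)

def motion_supervision_loss(F, F0, handles, targets, mask, r1=3, lam=20.0):
    """Sketch of Eq. (1). F: current features (1, C, H, W); F0: features of
    the initial image; mask: (1, 1, H, W) binary mask, 1 = movable region."""
    loss = F.new_zeros(())
    for p, t in zip(handles, targets):
        d = t - p
        d = d / d.norm().clamp_min(1e-8)           # normalized direction d_i
        # patch Omega_1(p, r1): pixel offsets q - p with ||q - p|| < r1
        ys, xs = torch.meshgrid(torch.arange(-r1, r1 + 1),
                                torch.arange(-r1, r1 + 1), indexing='ij')
        offs = torch.stack([xs.flatten(), ys.flatten()], 1).float()
        q = p.view(1, 2) + offs[offs.norm(dim=1) < r1]
        # detach F(q): gradients flow only through F(q + d), which pulls
        # the content at q toward q + d (and not the other way around)
        loss = loss + (bilinear_sample(F, q).detach()
                       - bilinear_sample(F, q + d)).abs().sum()
    # keep the unmasked (non-movable) region close to the initial features
    return loss + lam * ((F - F0) * (1 - mask)).abs().sum()
```

In the actual pipeline, one Adam step on this loss is taken per iteration, with the gradient flowing through the generator into 𝒘.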
3.3 Point Tracking
The previous motion supervision results in a new latent code 𝒘′,
new feature maps F′, and a new image I′. As the motion supervision
step does not readily provide the precise new locations of the handle
points, our goal here is to update each handle point 𝒑𝑖 such that it
tracks the corresponding point on the object. Point tracking is typi-
cally performed via optical flow estimation models or particle video
approaches [Harley et al. 2022]. Again, these additional models can
significantly harm efficiency and may suffer from accumulation
error, especially in the presence of aliasing artifacts in GANs. We thus
present a new point tracking approach for GANs. The insight is that
the discriminative features of GANs well capture dense correspon-
dence and thus tracking can be effectively performed via nearest
neighbor search in a feature patch. Specifically, we denote the fea-
ture of the initial handle point as 𝒇𝑖 = F0(𝒑𝑖). We denote the patch
around 𝒑𝑖 as Ω2(𝒑𝑖,𝑟2) = {(𝑥,𝑦) | |𝑥 − 𝑥𝑝,𝑖 | < 𝑟2, |𝑦 − 𝑦𝑝,𝑖 | < 𝑟2}.
Then the tracked point is obtained by searching for the nearest
neighbor of 𝒇𝑖 in Ω2(𝒑𝑖, 𝑟2):
𝒑𝑖 := arg min_{𝒒𝑖 ∈ Ω2(𝒑𝑖, 𝑟2)} ∥F′(𝒒𝑖) − 𝒇𝑖∥1.   (2)
In this way, 𝒑𝑖 is updated to track the object. For more than one
handle point, we apply the same process for each point. Note that
here we are also considering the feature maps F′ after the 6th block
of StyleGAN2. The feature maps have a resolution of 256 × 256 and
are bilinearly interpolated to the same size as the image if needed,
which is sufficient to perform accurate tracking in our experiments.
We analyze this choice in Sec. 4.2.
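A minimal NumPy sketch of the nearest-neighbor search in Eq. (2), assuming the feature map has already been resized to image resolution (the function name and exhaustive search are our illustrative choices):

```python
import numpy as np

def track_point(F_new, f_i, p, r2=12):
    """Sketch of Eq. (2): update handle p by searching the (2*r2 - 1)^2 patch
    Omega_2(p, r2) of the new feature map for the nearest neighbor (in L1
    distance) of the handle's initial feature f_i = F0[:, y, x].
    F_new: (C, H, W) feature map after one optimization step."""
    C, H, W = F_new.shape
    x0, y0 = int(round(p[0])), int(round(p[1]))
    best, best_q = None, (x0, y0)
    for y in range(max(0, y0 - r2 + 1), min(H, y0 + r2)):     # |y - y0| < r2
        for x in range(max(0, x0 - r2 + 1), min(W, x0 + r2)):  # |x - x0| < r2
            dist = np.abs(F_new[:, y, x] - f_i).sum()          # L1 distance
            if best is None or dist < best:
                best, best_q = dist, (x, y)
    return best_q

# toy check: plant a distinctive feature at (20, 17) and recover it
F_new = np.zeros((4, 32, 32))
f_i = np.array([1.0, -2.0, 0.5, 3.0])
F_new[:, 17, 20] = f_i
assert track_point(F_new, f_i, p=(15, 15)) == (20, 17)
```

For multiple handle points, the same search is simply repeated per point, as stated above.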
3.4 Implementation Details
We implement our approach based on PyTorch [Paszke et al. 2017].
We use the Adam optimizer [Kingma and Ba 2014] to optimize
the latent code 𝒘 with a step size of 2e-3 for FFHQ [Karras et al.
2019], AFHQCat [Choi et al. 2020], and LSUN Car [Yu et al. 2015]
datasets and 1e-3 for others. The hyper-parameters are set to be
𝜆 = 20, 𝑟1 = 3, 𝑟2 = 12. In our implementation, we stop the optimization process when all the handle points are no more than 𝑑 pixels
away from their corresponding target points, where 𝑑 is set to 1
for no more than 5 handle points and 2 otherwise. We also develop
a GUI to support interactive image manipulation. Thanks to the
computational efficiency of our approach, users only need to wait
for a few seconds for each edit and can continue the editing until
satisfied. We highly recommend readers refer to the supplemental
video for live recordings of interactive sessions.
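The stopping rule above is simple enough to state directly (a small sketch; coordinates are in pixels and the function name is ours):

```python
import numpy as np

def should_stop(handles, targets):
    """Stopping rule from Sec. 3.4: terminate when every handle point is no
    more than d pixels from its target, with d = 1 for at most 5 handle
    points and d = 2 otherwise."""
    d = 1.0 if len(handles) <= 5 else 2.0
    dists = np.linalg.norm(np.asarray(targets, float)
                           - np.asarray(handles, float), axis=1)
    return bool(np.all(dists <= d))

assert should_stop([(10, 10)], [(10.5, 10.5)])    # distance ~0.71 <= 1
assert not should_stop([(0, 0)], [(3, 4)])        # distance 5 > 1
```

The looser threshold for many points reflects that satisfying all constraints exactly becomes harder as the number of handles grows.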
4 EXPERIMENTS
Datasets. We evaluate our approach based on StyleGAN2 [Karras
et al. 2020] pretrained on the following datasets (the resolution of
the pretrained StyleGAN2 is shown in brackets): FFHQ (512) [Karras
et al. 2019], AFHQCat (512) [Choi et al. 2020], SHHQ (512) [Fu et al.
2022], LSUN Car (512) [Yu et al. 2015], LSUN Cat (256) [Yu et al.
2015], Landscapes HQ (256) [Skorokhodov et al. 2021], microscope
Fig. 5. Real image manipulation. Given a real image, we apply GAN inversion to map it to the latent space of StyleGAN, then edit the pose, hair, shape, and
expression, respectively.
Fig. 6. Qualitative tracking comparison of our approach to RAFT [Teed and
Deng 2020], PIPs [Harley et al. 2022], and without tracking. Our approach
tracks the handle point more accurately than baselines, thus producing
more precise editing.
(512) [Pinkney 2020], and the self-distilled datasets from [Mokady et al. 2022] including Lion (512), Dog (1024), and Elephant (512).
Baselines. Our main baseline is UserControllableLT [Endo 2022],
which has the closest setting to ours. UserControllableLT
does not support a mask input but allows users to define a number
of fixed points. Thus, for testing cases with a mask input, we sample
a regular 16 × 16 grid on the image and use the points outside the
mask as the fixed points for UserControllableLT. In addition, we also
compare with RAFT [Teed and Deng 2020] and PIPs [Harley et al.
2022] for point tracking. To do so, we create two variants of our
approach where the point tracking part (Sec.3.3) is replaced with
these two tracking methods.
4.1 Qualitative Evaluation
Fig. 4 shows the qualitative comparison between our method and
UserControllableLT. We show the image manipulation results for
several different object categories and user inputs. Our approach
accurately moves the handle points to reach the target points, achiev-
ing diverse and natural manipulation effects such as changing the
pose of animals, the shape of a car, and the layout of a landscape.
In contrast, UserControllableLT cannot faithfully move the handle
points to the targets and often leads to undesired changes in the
images, e.g., the clothes of the human and the background of the
car. It also does not keep the unmasked region as fixed as ours does,
as shown in the cat images. We show more comparisons in Fig. 10.
A comparison between our approach with PIPs and RAFT is
provided in Fig. 6. Our approach accurately tracks the handle point
above the nose of the lion, thus successfully driving it to the target
Fig. 7. Face landmark manipulation. Compared to UserControl-
lableLT [Endo 2022], our method can manipulate the landmarks detected
from the input image to match the landmarks detected from the target
image with less matching error.
Table 1. Quantitative evaluation on face keypoint manipulation. We com-
pute the mean distance between edited points and target points. The FID
and Time are reported based on the ‘1 point’ setting.
Method 1 point 5 points 68 points FID Time (s)
No edit 12.93 11.66 16.02 - -
UserControllableLT 11.64 10.41 10.15 25.32 0.03
Ours w. RAFT tracking 13.43 13.59 15.92 51.37 15.4
Ours w. PIPs tracking 2.98 4.83 5.30 31.87 6.6
Ours 2.44 3.18 4.73 9.28 2.0
position. In PIPs and RAFT, the tracked point starts to deviate from
the nose during the manipulation process. Consequently, they move
the wrong part to the target position. When no tracking is performed,
the fixed handle point soon starts to drive another part of the image
(e.g., background) after a few steps and never knows when to stop,
which fails to achieve the editing goal.
Real image editing. Using GAN inversion techniques that embed
a real image in the latent space of StyleGAN, we can also apply
our approach to manipulate real images. Fig. 5 shows an example,
where we apply PTI inversion [Roich et al. 2022] to the real image
and then perform a series of manipulations to edit the pose, hair,
shape, and expression of the face in the image. We show more real
image editing examples in Fig. 13.
4.2 Quantitative Evaluation
We quantitatively evaluate our method under two settings, including
face landmark manipulation and paired image reconstruction.
Face landmark manipulation. Since face landmark detection is
very reliable using an off-the-shelf tool [King 2009], we use its
prediction as ground truth landmarks. Specifically, we randomly
generate two face images using the StyleGAN trained on FFHQ and
detect their landmarks. The goal is to manipulate the landmarks
Table 2. Quantitative evaluation on paired image reconstruction. We follow the evaluation
in [Endo 2022] and report MSE (×102)↓ and LPIPS (×10)↓ scores.
Dataset Lion LSUN Cat Dog LSUN Car
Metric MSE LPIPS MSE LPIPS MSE LPIPS MSE LPIPS
UserControllableLT 1.82 1.14 1.25 0.87 1.23 0.92 1.98 0.85
Ours w. RAFT tracking 1.09 0.99 1.84 1.15 0.91 0.76 2.37 0.94
Ours w. PIPs tracking 0.80 0.82 1.11 0.85 0.78 0.63 1.81 0.79
Ours 0.66 0.72 1.04 0.82 0.48 0.44 1.67 0.74
Table 3. Effects of which feature to use. x+y means the con-
catenation of two features. We report the performance (MD)
of face landmark manipulation (1 point).
Block No. 4 5 6 7 5+6 6+7
Motion sup. 2.73 2.50 2.44 2.51 2.47 2.45
Tracking 3.61 2.55 2.44 2.58 2.47 2.45
Table 4. Effects of 𝑟1.
𝑟1 1 2 3 4 5
MD 2.49 2.51 2.44 2.45 2.46
Fig. 8. Effects of the mask. Our approach allows masking the movable
region. After masking the head region of the dog, the rest of the image remains almost unchanged.
of the first image to match the landmarks of the second image.
After manipulation, we detect the landmarks of the final image
and compute the mean distance (MD) to the target landmarks. The
results are averaged over 1000 tests. The same set of test samples is
used to evaluate all methods. In this way, the final MD score reflects
how well the method can move the landmarks to the target positions.
We perform the evaluation under 3 settings with different numbers
of landmarks including 1, 5, and 68 to show the robustness of our
approach under different numbers of handle points. We also report
the FID score between the edited images and the initial images as
an indication of image quality. In our approach and its variants, the
maximum optimization step is set to 300.
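The MD score reduces to a mean of per-landmark Euclidean distances; a small sketch (the landmarks themselves come from the off-the-shelf detector, which is not shown here):

```python
import numpy as np

def mean_distance(pred_landmarks, target_landmarks):
    """MD metric: mean Euclidean distance between the landmarks detected on
    the edited image and the target landmarks (lower is better)."""
    pred = np.asarray(pred_landmarks, dtype=float)
    tgt = np.asarray(target_landmarks, dtype=float)
    return float(np.linalg.norm(pred - tgt, axis=1).mean())

# two landmarks: one off by a 3-4-5 triangle (distance 5), one exact
assert mean_distance([(0, 0), (10, 0)], [(3, 4), (10, 0)]) == 2.5
```

In the evaluation above, this score is further averaged over 1000 test pairs.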
The results are provided in Table 1. Our approach significantly
outperforms UserControllableLT under different numbers of points.
A qualitative comparison is shown in Fig. 7, where our method
opens the mouth and adjusts the shape of the jaw to match the
target face while UserControllableLT fails to do so. Furthermore,
our approach preserves better image quality as indicated by the FID
scores. Thanks to a better tracking capability, we also achieve more
accurate manipulation than RAFT and PIPs. Inaccurate tracking
also leads to excessive manipulation, which deteriorates the image
quality, as reflected in the FID scores. Although UserControllableLT is
faster, our approach largely pushes the upper bound of this task,
achieving much more faithful manipulation while maintaining a
comfortable running time for users.
Paired image reconstruction. In this evaluation, we follow the
same setting as UserControllableLT [Endo 2022]. Specifically, we
sample a latent code 𝒘1 and randomly perturb it to get 𝒘2 in the
same way as in [Endo 2022]. Let I1 and I2 be the StyleGAN images
generated from the two latent codes. We then compute the optical
flow between I1 and I2 and randomly sample 32 pixels from the flow
field as the user input U. The goal is to reconstruct I2 from I1 and
U. We report MSE and LPIPS [Zhang et al. 2018] and average the
results over 1000 samples. The maximum optimization step is set
to 100 in our approach and its variants. As shown in Table 2, our
approach outperforms all the baselines in different object categories,
which is consistent with previous results.
Fig. 9. Out-of-distribution manipulations. Our approach has extrapolation
capability for creating images out of the training image distribution, for
example, an extremely opened mouth and a greatly enlarged wheel.
Ablation Study. Here we study the effects of which feature to use
in motion supervision and point tracking. We report the perfor-
mance (MD) of face landmark manipulation using different features.
As Table 3 shows, in both motion supervision and point tracking,
the feature maps after the 6th block of StyleGAN perform the best,
showing the best balance between resolution and discriminative-
ness. We also provide the effects of 𝑟1 in Table 4. It can be observed
that the performance is not very sensitive to the choice of 𝑟1, and
𝑟1 = 3 performs slightly better.
4.3 Discussions
Effects of mask. Our approach allows users to input a binary
mask denoting the movable region. We show its effects in Fig. 8.
When a mask over the head of the dog is given, the other regions
are almost fixed and only the head moves. Without the mask, the
manipulation moves the whole dog’s body. This also shows that
point-based manipulation often has multiple possible solutions and
the GAN will tend to find the closest solution in the image manifold
learned from the training data. The mask function can help to reduce
ambiguity and keep certain regions fixed.
Out-of-distribution manipulation. So far, the point-based manipu-
lations we have shown are “in-distribution” manipulations, i.e., it
is possible to satisfy the manipulation requirements with a natural
image inside the image distribution of the training dataset. Here we
showcase some out-of-distribution manipulations in Fig. 9. It can be
seen that our approach has some extrapolation capability, creating
images outside the training image distribution, e.g., an extremely
opened mouth and a large wheel. In some cases, users may want to
always keep the image in the training distribution and prevent it
from reaching such out-of-distribution manipulations. A potential
way to achieve this is to add additional regularization to the latent
code 𝒘, which is not the main focus of this paper.
Limitations. Despite some extrapolation capability, our editing
quality is still affected by the diversity of training data. As exem-
plified in Fig. 14 (a), creating a human pose that deviates from the
training distribution can lead to artifacts. Besides, handle points in
texture-less regions sometimes suffer from more drift in tracking, as
shown in Fig. 14 (b)(c). We thus suggest picking texture-rich handle
points if possible.
Social impacts. As our method can change the spatial attributes
of images, it could be misused to create images of a real person with
a fake pose, expression, or shape. Thus, any application or research
that uses our approach has to strictly respect personality rights and
privacy regulations.
5 CONCLUSION
We have presented DragGAN, an interactive approach for intuitive
point-based image editing. Our method leverages a pre-trained GAN
to synthesize images that not only precisely follow user input, but
also stay on the manifold of realistic images. In contrast to many
previous approaches, we present a general framework by not relying
on domain-specific modeling or auxiliary networks. This is achieved
using two novel ingredients: An optimization of latent codes that
incrementally moves multiple handle points towards their target
locations, and a point tracking procedure to faithfully trace the
trajectory of the handle points. Both components utilize the dis-
criminative quality of intermediate feature maps of the GAN to
yield pixel-precise image deformations and interactive performance.
We have demonstrated that our approach outperforms the state of
the art in GAN-based manipulation and opens new directions for
powerful image editing using generative priors. As for future work,
we plan to extend point-based editing to 3D generative models.
ACKNOWLEDGMENTS
Christian Theobalt was supported by ERC Consolidator Grant 4DReply
(770784). Lingjie Liu was supported by a Lise Meitner Postdoctoral Fellowship. This project was also supported by the Saarbrücken Research
Center for Visual Computing, Interaction and AI.
REFERENCES
Rameen Abdal, Yipeng Qin, and Peter Wonka. 2019. Image2stylegan: How to embed
images into the stylegan latent space? In ICCV.
Rameen Abdal, Peihao Zhu, Niloy J Mitra, and Peter Wonka. 2021. Styleflow: Attribute-
conditioned exploration of stylegan-generated images using conditional continuous
normalizing flows. ACM Transactions on Graphics (ToG) 40, 3 (2021), 1–21.
Thomas Brox and Jitendra Malik. 2010. Large displacement optical flow: descriptor
matching in variational motion estimation. IEEE transactions on pattern analysis
and machine intelligence 33, 3 (2010), 500–513.
Eric R. Chan, Connor Z. Lin, Matthew A. Chan, Koki Nagano, Boxiao Pan, Shalini De
Mello, Orazio Gallo, Leonidas Guibas, Jonathan Tremblay, Sameh Khamis, Tero
Karras, and Gordon Wetzstein. 2022. Efficient Geometry-aware 3D Generative
Adversarial Networks. In CVPR.
Eric R Chan, Marco Monteiro, Petr Kellnhofer, Jiajun Wu, and Gordon Wetzstein.
2021. pi-gan: Periodic implicit generative adversarial networks for 3d-aware image
synthesis. In CVPR.
Anpei Chen, Ruiyang Liu, Ling Xie, Zhang Chen, Hao Su, and Jingyi Yu. 2022. Sofgan:
A portrait image generator with dynamic styling. ACM Transactions on Graphics
(TOG) 41, 1 (2022), 1–26.
Yunjey Choi, Youngjung Uh, Jaejun Yoo, and Jung-Woo Ha. 2020. StarGAN v2: Diverse
Image Synthesis for Multiple Domains. In CVPR.
Edo Collins, Raja Bala, Bob Price, and Sabine Susstrunk. 2020. Editing in style: Uncov-
ering the local semantics of gans. In CVPR. 5771–5780.
Antonia Creswell, Tom White, Vincent Dumoulin, Kai Arulkumaran, Biswa Sengupta,
and Anil A Bharath. 2018. Generative adversarial networks: An overview. IEEE
signal processing magazine 35, 1 (2018), 53–65.
Yu Deng, Jiaolong Yang, Dong Chen, Fang Wen, and Xin Tong. 2020. Disentangled
and Controllable Face Image Generation via 3D Imitative-Contrastive Learning. In
CVPR.
Alexey Dosovitskiy, Philipp Fischer, Eddy Ilg, Philip Hausser, Caner Hazirbas, Vladimir
Golkov, Patrick Van Der Smagt, Daniel Cremers, and Thomas Brox. 2015. Flownet:
Learning optical flow with convolutional networks. In ICCV.
Yuki Endo. 2022. User-Controllable Latent Transformer for StyleGAN Image Layout
Editing. Computer Graphics Forum 41, 7 (2022), 395–406. https://doi.org/10.1111/
cgf.14686
Dave Epstein, Taesung Park, Richard Zhang, Eli Shechtman, and Alexei A Efros. 2022.
Blobgan: Spatially disentangled scene representations. In ECCV. 616–635.
Jianglin Fu, Shikai Li, Yuming Jiang, Kwan-Yee Lin, Chen Qian, Chen-Change Loy,
Wayne Wu, and Ziwei Liu. 2022. StyleGAN-Human: A Data-Centric Odyssey of
Human Generation. In ECCV.
Partha Ghosh, Pravir Singh Gupta, Roy Uziel, Anurag Ranjan, Michael J Black, and
Timo Bolkart. 2020. GIF: Generative interpretable faces. In International Conference
on 3D Vision (3DV).
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil
Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. In
NeurIPS.
Jiatao Gu, Lingjie Liu, Peng Wang, and Christian Theobalt. 2022. StyleNeRF: A Style-
based 3D-Aware Generator for High-resolution Image Synthesis. In ICLR.
Erik Härkönen, Aaron Hertzmann, Jaakko Lehtinen, and Sylvain Paris. 2020. GANSpace:
Discovering Interpretable GAN Controls. arXiv preprint arXiv:2004.02546 (2020).
Adam W. Harley, Zhaoyuan Fang, and Katerina Fragkiadaki. 2022. Particle Video
Revisited: Tracking Through Occlusions Using Point Trajectories. In ECCV.
Jonathan Ho, Ajay Jain, and Pieter Abbeel. 2020. Denoising diffusion probabilistic
models. In NeurIPS.
Takeo Igarashi, Tomer Moscovich, and John F Hughes. 2005. As-rigid-as-possible shape
manipulation. ACM transactions on Graphics (TOG) 24, 3 (2005), 1134–1141.
Eddy Ilg, Nikolaus Mayer, Tonmoy Saikia, Margret Keuper, Alexey Dosovitskiy, and
Thomas Brox. 2017. Flownet 2.0: Evolution of optical flow estimation with deep
networks. In CVPR.
Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A Efros. 2017. Image-to-image
translation with conditional adversarial networks. In CVPR.
Tero Karras, Miika Aittala, Samuli Laine, Erik Härkönen, Janne Hellsten, Jaakko Lehti-
nen, and Timo Aila. 2021. Alias-Free Generative Adversarial Networks. In NeurIPS.
Tero Karras, Samuli Laine, and Timo Aila. 2019. A style-based generator architecture
for generative adversarial networks. In CVPR. 4401–4410.
Tero Karras, Samuli Laine, Miika Aittala, Janne Hellsten, Jaakko Lehtinen, and Timo Aila.
2020. Analyzing and improving the image quality of stylegan. In CVPR. 8110–8119.
Davis E. King. 2009. Dlib-ml: A Machine Learning Toolkit. Journal of Machine Learning
Research 10 (2009), 1755–1758.
Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization.
arXiv preprint arXiv:1412.6980 (2014).
Thomas Leimkühler and George Drettakis. 2021. FreeStyleGAN: Free-view Editable
Portrait Rendering with the Camera Manifold. ACM Transactions on Graphics (TOG)
40, 6 (2021). https://doi.org/10.1145/3478513.3480538
Huan Ling, Karsten Kreis, Daiqing Li, Seung Wook Kim, Antonio Torralba, and Sanja
Fidler. 2021. Editgan: High-precision semantic image editing. In NeurIPS.
Ron Mokady, Omer Tov, Michal Yarom, Oran Lang, Inbar Mosseri, Tali Dekel, Daniel
Cohen-Or, and Michal Irani. 2022. Self-distilled stylegan: Towards generation from
internet photos. In ACM SIGGRAPH 2022 Conference Proceedings. 1–9.
Xingang Pan, Xudong Xu, Chen Change Loy, Christian Theobalt, and Bo Dai. 2021. A
Shading-Guided Generative Implicit Model for Shape-Accurate 3D-Aware Image
Synthesis. In NeurIPS.
Taesung Park, Ming-Yu Liu, Ting-Chun Wang, and Jun-Yan Zhu. 2019. Semantic image
synthesis with spatially-adaptive normalization. In CVPR.
Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary
DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. 2017. Auto-
matic differentiation in PyTorch. (2017).
Or Patashnik, Zongze Wu, Eli Shechtman, Daniel Cohen-Or, and Dani Lischinski. 2021.
Styleclip: Text-driven manipulation of stylegan imagery. In ICCV.
Justin N. M. Pinkney. 2020. Awesome pretrained StyleGAN2. https://github.com/
justinpinkney/awesome-pretrained-stylegan2.
Aditya Ramesh, Prafulla Dhariwal, Alex Nichol, Casey Chu, and Mark Chen. 2022.
Hierarchical text-conditional image generation with clip latents. arXiv preprint
arXiv:2204.06125 (2022).
Daniel Roich, Ron Mokady, Amit H Bermano, and Daniel Cohen-Or. 2022. Pivotal
tuning for latent-based editing of real images. ACM Transactions on Graphics (TOG)
42, 1 (2022), 1–13.
Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn
Ommer. 2021. High-Resolution Image Synthesis with Latent Diffusion Models.
arXiv:2112.10752 [cs.CV]
Chitwan Saharia, William Chan, Saurabh Saxena, Lala Li, Jay Whang, Emily Denton,
Seyed Kamyar Seyed Ghasemipour, Burcu Karagol Ayan, S Sara Mahdavi, Rapha Gon-
tijo Lopes, et al. 2022. Photorealistic Text-to-Image Diffusion Models with Deep
Language Understanding. arXiv preprint arXiv:2205.11487 (2022).
Katja Schwarz, Yiyi Liao, Michael Niemeyer, and Andreas Geiger. 2020. GRAF: Genera-
tive Radiance Fields for 3D-Aware Image Synthesis. In NeurIPS.
Yujun Shen, Jinjin Gu, Xiaoou Tang, and Bolei Zhou. 2020. Interpreting the latent space
of gans for semantic face editing. In CVPR.
Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold. SIGGRAPH ’23 Conference Proceedings, August 6–10, 2023, Los Angeles, CA, USA
Yujun Shen and Bolei Zhou. 2020. Closed-Form Factorization of Latent Semantics in
GANs. arXiv preprint arXiv:2007.06600 (2020).
Ivan Skorokhodov, Grigorii Sotnikov, and Mohamed Elhoseiny. 2021. Aligning Latent
and Image Spaces to Connect the Unconnectable. arXiv preprint arXiv:2104.06954
(2021).
Jascha Sohl-Dickstein, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. 2015.
Deep unsupervised learning using nonequilibrium thermodynamics. In International
Conference on Machine Learning. PMLR, 2256–2265.
Jiaming Song, Chenlin Meng, and Stefano Ermon. 2020. Denoising Diffusion Implicit
Models. In ICLR.
Yang Song, Jascha Sohl-Dickstein, Diederik P Kingma, Abhishek Kumar, Stefano Er-
mon, and Ben Poole. 2021. Score-Based Generative Modeling through Stochastic
Differential Equations. In International Conference on Learning Representations.
Narayanan Sundaram, Thomas Brox, and Kurt Keutzer. 2010. Dense point trajectories
by gpu-accelerated large displacement optical flow. In ECCV.
Ryohei Suzuki, Masanori Koyama, Takeru Miyato, Taizan Yonetsuji, and Huachun Zhu.
2018. Spatially controllable image synthesis with internal representation collaging.
arXiv preprint arXiv:1811.10153 (2018).
Zachary Teed and Jia Deng. 2020. Raft: Recurrent all-pairs field transforms for optical
flow. In ECCV.
Ayush Tewari, Mallikarjun B R, Xingang Pan, Ohad Fried, Maneesh Agrawala, and
Christian Theobalt. 2022. Disentangled3D: Learning a 3D Generative Model with
Disentangled Geometry and Appearance from Monocular Images. In CVPR.
Ayush Tewari, Mohamed Elgharib, Gaurav Bharaj, Florian Bernard, Hans-Peter Seidel,
Patrick Pérez, Michael Zollhofer, and Christian Theobalt. 2020. StyleRig: Rigging
StyleGAN for 3D Control over Portrait Images. In CVPR.
Nontawat Tritrong, Pitchaporn Rewatbowornwong, and Supasorn Suwajanakorn. 2021.
Repurposing gans for one-shot semantic part segmentation. In Proceedings of the
IEEE/CVF conference on computer vision and pattern recognition. 4475–4485.
Jianyuan Wang, Ceyuan Yang, Yinghao Xu, Yujun Shen, Hongdong Li, and Bolei Zhou.
2022b. Improving gan equilibrium by raising spatial awareness. In CVPR. 11285–
11293.
Sheng-Yu Wang, David Bau, and Jun-Yan Zhu. 2022a. Rewriting Geometric Rules of a
GAN. ACM Transactions on Graphics (TOG) (2022).
Yinghao Xu, Sida Peng, Ceyuan Yang, Yujun Shen, and Bolei Zhou. 2022. 3D-aware
Image Synthesis via Learning Structural and Textural Representations. In CVPR.
Fisher Yu, Ari Seff, Yinda Zhang, Shuran Song, Thomas Funkhouser, and Jianxiong
Xiao. 2015. Lsun: Construction of a large-scale image dataset using deep learning
with humans in the loop. arXiv preprint arXiv:1506.03365 (2015).
Richard Zhang, Phillip Isola, Alexei A Efros, Eli Shechtman, and Oliver Wang. 2018.
The Unreasonable Effectiveness of Deep Features as a Perceptual Metric. In CVPR.
Yuxuan Zhang, Huan Ling, Jun Gao, Kangxue Yin, Jean-Francois Lafleche, Adela Bar-
riuso, Antonio Torralba, and Sanja Fidler. 2021. DatasetGAN: Efficient Labeled Data
Factory with Minimal Human Effort. In CVPR.
Jiapeng Zhu, Ceyuan Yang, Yujun Shen, Zifan Shi, Deli Zhao, and Qifeng Chen. 2023.
LinkGAN: Linking GAN Latents to Pixels for Controllable Image Synthesis. arXiv
preprint arXiv:2301.04604 (2023).
Jun-Yan Zhu, Philipp Krähenbühl, Eli Shechtman, and Alexei A Efros. 2016. Generative
visual manipulation on the natural image manifold. In ECCV.
SIGGRAPH ’23 Conference Proceedings, August 6–10, 2023, Los Angeles, CA, USA. X. Pan, A. Tewari, T. Leimkühler, L. Liu, A. Meka, C. Theobalt
[Figure: two groups of panels labeled Inputs / Ours / UserControllableLT]
Fig. 10. Qualitative comparison. This is an extension of Fig. 4.
[Figure: two examples, each with panels labeled Input / Target / Ours]
Fig. 11. Face landmark manipulation. Our method works well even for such dense keypoint cases.
[Figure: panels labeled 1st Edit (foot) / 2nd Edit (mouth) / 3rd Edit (ears)]
Fig. 12. Continuous image manipulation. Users can continue the manipulation based on previous manipulation results.
[Figure: panels labeled Real image / 1st Edit (hair) / 2nd Edit (expression) / 3rd Edit (pose), with GAN inversion applied to the real image]
Fig. 13. Real image manipulation.
[Figure: (a) Out-of-distribution pose; (b) Texture-less handle point; (c) Texture-rich handle point]
Fig. 14. Limitations. (a) StyleGAN-Human [Fu et al. 2022] is trained on a fashion dataset in which most arms and legs point downward; editing toward out-of-distribution poses can therefore cause distortion artifacts, as shown in the legs and hands. (b)&(c) A handle point (red) in a texture-less region may suffer more drift during tracking, as can be observed from its position relative to the rearview mirror.
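The drift behavior in (b) can be illustrated with a minimal nearest-neighbor feature-matching tracker (a simplified sketch under assumed feature shapes, not the paper's exact implementation): the handle point moves to the location in a small search window whose feature vector best matches the original handle feature.

```python
import numpy as np

def track(feat, f0, center, radius=1):
    """Return the position in a (2*radius+1)^2 window around `center`
    whose feature vector is closest to the original handle feature f0."""
    H, W, _ = feat.shape
    cy, cx = center
    best, best_d = center, np.inf
    for y in range(max(0, cy - radius), min(H, cy + radius + 1)):
        for x in range(max(0, cx - radius), min(W, cx + radius + 1)):
            d = np.linalg.norm(feat[y, x] - f0)
            if d < best_d:
                best, best_d = (y, x), d
    return best
```

In a texture-rich region one candidate matches f0 much better than its neighbors, so the tracked point is well localized; in a texture-less region every candidate matches equally well and the returned position is ambiguous, which is exactly the drift the caption describes.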
Fig. 15. Effects of the mask. By masking the foreground object, we can fix the background: the details of the trees and grass are kept nearly unchanged. Better background preservation could potentially be achieved via feature blending [Suzuki et al. 2018].
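The masking idea in the caption can be sketched as a reconstruction penalty restricted to the un-masked region (a minimal sketch; the function name and feature shapes are hypothetical, not the paper's implementation): feature changes inside the editable mask are free, while any deviation in the background is penalized so it stays fixed.

```python
import numpy as np

def background_loss(feat, feat_orig, mask):
    """L1 penalty on feature changes outside the editable mask.
    feat, feat_orig: (H, W, C) feature maps; mask: (H, W) bool,
    True where editing is allowed."""
    return float(np.abs((feat - feat_orig)[~mask]).sum())

# A 4x4 toy feature map: an edit inside the mask is free,
# a change in the background is penalized.
feat_orig = np.zeros((4, 4, 2))
feat = feat_orig.copy()
mask = np.zeros((4, 4), dtype=bool)
mask[:2, :2] = True        # editable (foreground) region
feat[0, 0] += 1.0          # inside the mask: contributes nothing
feat[3, 3] += 1.0          # background change: contributes 2.0 (2 channels)
```

Minimizing such a term alongside the editing objective keeps the un-masked region close to its original appearance, which is the effect shown in the figure.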
[Figure: panels labeled Input / W+ / W]
Fig. 16. Effects of W/W+ space. Optimizing the latent code in W+ space makes it easier to achieve out-of-distribution manipulations such as closing only one eye of the cat. In contrast, W space struggles to achieve this, as it tends to keep the image within the distribution of the training data.
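The extra flexibility of W+ over W can be illustrated with a deliberately simplified linear stand-in for a layer-wise generator (a toy model, not StyleGAN itself; all names here are hypothetical): a single shared code must fit every layer's output at once, while per-layer codes can fit each layer independently.

```python
import numpy as np

# Toy "generator": each layer maps a latent code to one chunk of the output.
rng = np.random.default_rng(0)
num_layers, dim = 3, 4
layers = [rng.standard_normal((dim, dim)) for _ in range(num_layers)]
target = rng.standard_normal(num_layers * dim)   # the edit we want to reach

# W space: one shared code for all layers -> a stacked least-squares problem
# with only `dim` degrees of freedom, so a residual generally remains.
A = np.vstack(layers)                            # (num_layers*dim, dim)
w_shared, *_ = np.linalg.lstsq(A, target, rcond=None)
err_w = float(np.linalg.norm(A @ w_shared - target))

# W+ space: an independent code per layer -> each output chunk is matched
# exactly (num_layers * dim degrees of freedom).
w_plus = [np.linalg.solve(L, target[i * dim:(i + 1) * dim])
          for i, L in enumerate(layers)]
out = np.concatenate([L @ w for L, w in zip(layers, w_plus)])
err_wplus = float(np.linalg.norm(out - target))
```

With per-layer codes the toy target is reached exactly, while the shared code leaves a residual; this mirrors, in miniature, why W+ admits out-of-distribution edits that W resists.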