Today I heard a term that was new to me: Generative AI.
It appears to mean AI that generates content itself rather than just doing calculations like math. So generative AI covers things like composing term papers, drawing original pictures, making its own movies or video content, writing its own books, and so on.
I could give you various interpretations of Generative AI, but most of them are proprietary in nature, written by companies interested in making money from it. So I decided to give you a relatively unbiased view, in that almost anyone knowledgeable can write about Generative AI at Wikipedia. It might not be perfect either, but at least you can get the general idea without it being as proprietary.
Begin quote from:
https://en.wikipedia.org/wiki/Generative_artificial_intelligence
Generative artificial intelligence
Generative artificial intelligence (generative AI or GenAI[1]) is artificial intelligence capable of generating text, images, or other media, using generative models.[2][3][4] Generative AI models learn the patterns and structure of their input training data and then generate new data that has similar characteristics.[5][6]
In the early 2020s, advances in transformer-based deep neural networks enabled a number of generative AI systems notable for accepting natural language prompts as input. These include large language model chatbots such as ChatGPT, Copilot, Bard, and LLaMA, and text-to-image artificial intelligence art systems such as Stable Diffusion, Midjourney, and DALL-E.[7][8][9]
Generative AI has uses across a wide range of industries, including art, writing, script writing, software development, product design, healthcare, finance, gaming, marketing, and fashion.[10][11][12] Investment in generative AI surged during the early 2020s, with large companies such as Microsoft, Google, and Baidu as well as numerous smaller firms developing generative AI models.[2][13][14] However, there are also concerns about the potential misuse of generative AI, including cybercrime or creating fake news or deepfakes which can be used to deceive or manipulate people.[15][16]
History
The academic discipline of artificial intelligence was founded at a research workshop at Dartmouth College in 1956, and has experienced several waves of advancement and optimism in the decades since.[17] Since its founding, researchers in the field have raised philosophical and ethical arguments about the nature of the human mind and the consequences of creating artificial beings with human-like intelligence; these issues have previously been explored by myth, fiction and philosophy since antiquity.[18] These concepts of automated art date at least to the automata of ancient Greek civilization, where inventors such as Daedalus and Hero of Alexandria were described as having designed machines capable of writing text, generating sounds, and playing music.[19][20] The tradition of creative automatons has flourished throughout history, such as Maillardet's automaton, created in the early 1800s.[21]
Artificial intelligence has captivated society since the mid-20th century. Science fiction first familiarized the world with the concept, but it was not examined scientifically until the polymath Alan Turing took up the question of its feasibility. Turing's groundbreaking 1950 paper, "Computing Machinery and Intelligence," posed fundamental questions about whether machines could reason as humans do, significantly contributing to the conceptual groundwork of AI. Early development was slow because computing was expensive and computers could not yet store commands. This changed with the 1956 Dartmouth Summer Research Project on AI, whose call for AI research made it a landmark event and set the precedent for two decades of rapid advancement in the field.[22]
Since the founding of AI in the 1950s, artists and researchers have used artificial intelligence to create artistic works. By the early 1970s, Harold Cohen was creating and exhibiting generative AI works created by AARON, the computer program Cohen created to generate paintings.[23]
Markov chains have been used to model natural languages since their development by Russian mathematician Andrey Markov in the early 20th century. Markov published his first paper on the topic in 1906,[24][25][26] and analyzed the pattern of vowels and consonants in the novel Eugene Onegin using Markov chains. Once a Markov chain has been learned from a text corpus, it can be used as a probabilistic text generator.[27][28]
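As a rough illustration of the last point, here is a minimal sketch of a word-level Markov chain text generator in Python. The function names (build_chain, generate), the toy corpus, and the choice of words rather than letters as states are all assumptions made for this example; they do not come from the quoted article or from Markov's original analysis.

```python
import random
from collections import defaultdict


def build_chain(text, order=1):
    """Map each run of `order` words to the list of words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)


def generate(chain, start, length=10, seed=None):
    """Walk the chain from `start`, sampling each next word from observed followers."""
    rng = random.Random(seed)
    state = tuple(start)
    output = list(state)
    for _ in range(length):
        followers = chain.get(state)
        if not followers:  # dead end: nothing ever followed this state in the corpus
            break
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)


# Toy corpus, invented for illustration only.
corpus = "the cat sat on the mat and the cat ate the fish"
chain = build_chain(corpus)
print(generate(chain, start=["the"], length=8, seed=0))
```

Because followers are stored with repetition, a word that follows a state more often in the corpus is proportionally more likely to be sampled, which is what makes the generator probabilistic rather than a fixed lookup.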
The field of machine learning often uses statistical models, including generative models, to model and predict data. Beginning in the late 2000s, the emergence of deep learning drove progress and research in image classification, speech recognition, natural language processing and other tasks. Neural networks in this era were typically trained as discriminative models, due to the difficulty of generative modeling.[29]
In 2014, advancements such as the variational autoencoder and generative adversarial network produced the first practical deep neural networks capable of learning generative, rather than discriminative, models of complex data such as images. These deep generative models were the first able to output not only class labels for images, but to output entire images.
In 2017, the Transformer network enabled advancements in generative models compared to older Long-Short Term Memory models,[30] leading to the first generative pre-trained transformer (GPT), known as GPT-1, in 2018.[31] This was followed in 2019 by GPT-2 which demonstrated the ability to generalize unsupervised to many different tasks as a Foundation model.[32]
In 2021, the release of DALL-E, a transformer-based pixel generative model, followed by Midjourney and Stable Diffusion marked the emergence of practical high-quality artificial intelligence art from natural language prompts.
In March 2023, GPT-4 was released. A team from Microsoft Research argued that "it could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system".[33] Other scholars have disputed that GPT-4 reaches this threshold, calling generative AI "still far from reaching the benchmark of ‘general human intelligence’" as of 2023.[34]
Modalities
A generative AI system is constructed by applying unsupervised or self-supervised machine learning to a data set. The capabilities of a generative AI system depend on the modality or type of the data set used.
Generative AI can be either unimodal or multimodal; unimodal systems take only one type of input, whereas multimodal systems can take more than one type of input.[35] For example, one version of OpenAI's GPT-4 accepts both text and image inputs.[36]
Text
Generative AI systems trained on words or word tokens include GPT-3, LaMDA, LLaMA, BLOOM, GPT-4, and others (see List of large language models). They are capable of natural language processing, machine translation, and natural language generation and can be used as foundation models for other tasks.[37] Data sets include BookCorpus, Wikipedia, and others (see List of text corpora).
Code
In addition to natural language text, large language models can be trained on programming language text, allowing them to generate source code for new computer programs.[38] Examples include OpenAI Codex.
Images
Producing high-quality visual art is a prominent application of generative AI.[39] Many such artistic works have received public awards and recognition.
Generative AI systems trained on sets of images with text captions include Imagen, DALL-E, Midjourney, Adobe Firefly, Stable Diffusion and others (see Artificial intelligence art, Generative art, and Synthetic media). They are commonly used for text-to-image generation and neural style transfer.[40] Datasets include LAION-5B and others (See List of datasets in computer vision and image processing).
Audio
Generative AI can also be trained extensively on audio clips to produce natural-sounding speech synthesis and text-to-speech capabilities, exemplified by ElevenLabs' context-aware synthesis tools or Meta Platforms' Voicebox.[41]
Generative AI systems such as MusicLM[42] and MusicGen[43] can also be trained on the audio waveforms of recorded music along with text annotations, in order to generate new musical samples based on text descriptions such as "a calming violin melody backed by a distorted guitar riff".
Video
Generative AI trained on annotated video can generate temporally-coherent video clips. Examples include Gen-1 and Gen-2 by Runway[44] and Make-A-Video by Meta Platforms.[45]
Molecules
Generative AI systems can be trained on sequences of amino acids or molecular representations such as SMILES representing DNA or proteins. These systems, such as AlphaFold, are used for protein structure prediction and drug discovery.[46] Datasets include various biological datasets.
Robotics
Generative AI can also be trained on the motions of a robotic system to generate new trajectories for motion planning or navigation. For example, UniPi from Google Research uses prompts like "pick up blue bowl" or "wipe plate with yellow sponge" to control movements of a robot arm.[47] Multimodal "vision-language-action" models such as Google's RT-2 can perform rudimentary reasoning in response to user prompts and visual input, such as picking up a toy dinosaur when given the prompt "pick up the extinct animal" at a table filled with toy animals and other objects.[48]
Planning
The terms generative AI planning or generative planning were used in the 1980s and 1990s to refer to AI planning systems, especially computer-aided process planning, used to generate sequences of actions to reach a specified goal.[49][50]
Generative AI planning systems used symbolic AI methods such as state space search and constraint satisfaction and were a "relatively mature" technology by the early 1990s. They were used to generate crisis action plans for military use,[51] process plans for manufacturing[49] and decision plans such as in prototype autonomous spacecraft.[52]
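To give a feel for what state space search means here, the following is a minimal Python sketch of a planner that uses breadth-first search over sets of facts. The manufacturing-style actions (cut_stock, drill_hole, deburr, paint), their preconditions and effects, and the plan function itself are hypothetical and invented for this example; they are not taken from any of the planning systems cited above.

```python
from collections import deque

# Each hypothetical action: (name, preconditions, facts added, facts removed).
ACTIONS = [
    ("cut_stock", {"raw_stock"}, {"blank"}, {"raw_stock"}),
    ("drill_hole", {"blank"}, {"hole"}, set()),
    ("deburr", {"hole"}, {"deburred"}, set()),
    ("paint", {"blank", "deburred"}, {"painted"}, set()),
]


def plan(initial, goal):
    """Breadth-first search over sets of facts.

    Returns a shortest list of action names that reaches the goal,
    or None if no sequence of actions can reach it.
    """
    start = frozenset(initial)
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        state, steps = queue.popleft()
        if goal <= state:  # every goal fact holds in this state
            return steps
        for name, pre, add, delete in ACTIONS:
            if pre <= state:  # action is applicable
                nxt = frozenset((state - delete) | add)
                if nxt not in visited:
                    visited.add(nxt)
                    queue.append((nxt, steps + [name]))
    return None


print(plan({"raw_stock"}, {"painted"}))
# prints ['cut_stock', 'drill_hole', 'deburr', 'paint']
```

Breadth-first search is only one of many search strategies; it is used here because it is short to write and returns a plan with the fewest steps, not because the historical systems necessarily worked this way.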
Software and hardware
Generative AI models are used to power chatbot products such as ChatGPT, programming tools such as GitHub Copilot,[53] text-to-image products such as Midjourney, and text-to-video products such as Runway Gen-2.[54] Generative AI features have been integrated into a variety of existing commercially available products such as Microsoft Office,[55] Google Photos,[56] and Adobe Photoshop.[57] Many generative AI models are also available as open-source software, including Stable Diffusion and the LLaMA[58] language model.
Smaller generative AI models with up to a few billion parameters can run on smartphones, embedded devices, and personal computers. For example, LLaMA-7B (a version with 7 billion parameters) can run on a Raspberry Pi 4[59] and one version of Stable Diffusion can run on an iPhone 11.[60]
Larger models with tens of billions of parameters can run on laptop or desktop computers. To achieve an acceptable speed, models of this size may require accelerators such as the GPU chips produced by Nvidia and AMD or the Neural Engine included in Apple silicon products. For example, the 65 billion parameter version of LLaMA can be configured to run on a desktop PC.[61]
Language models with hundreds of billions of parameters, such as GPT-4 or PaLM, typically run on datacenter computers equipped with arrays of GPUs (such as Nvidia's H100) or AI accelerator chips (such as Google's TPU). These very large models are typically accessed as cloud services over the Internet.
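The link between parameter count and hardware can be made concrete with some rough arithmetic. The Python sketch below estimates the memory needed just to hold a model's weights. The bytes-per-parameter figures (4 for 32-bit floats, 2 for 16-bit, 1 for 8-bit, 0.5 for 4-bit quantization) and the function name weight_memory_gb are assumptions for this example, and the estimate ignores activations, caches, and runtime overhead, so real requirements will be higher.

```python
# Assumed storage cost per parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params, precision):
    """Approximate storage for the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


# Parameter counts mentioned in the text: LLaMA 7B and LLaMA 65B.
for label, n in [("7B", 7e9), ("65B", 65e9)]:
    sizes = ", ".join(
        f"{p}: {weight_memory_gb(n, p):.1f} GB" for p in BYTES_PER_PARAM
    )
    print(f"{label} -> {sizes}")
```

Under these assumptions, a 7 billion parameter model takes about 14 GB at 16-bit precision but only about 3.5 GB at 4-bit, which suggests why heavily quantized versions can fit on small devices, while a 65 billion parameter model needs roughly 32.5 GB even at 4-bit.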
In 2022, the United States New Export Controls on Advanced Computing and Semiconductors to China imposed restrictions on exports to China of GPU and AI accelerator chips used for generative AI.[62] Chips such as the Nvidia A800[63] and the Biren Technology BR104[64] were developed to meet the requirements of the sanctions.
There is free software on the market capable of recognizing text generated by generative artificial intelligence (such as GPTZero), as well as images, audio or video coming from it.[65]
Concerns
The impact of AI on numerous industries has been profound, revolutionizing productivity, decision-making processes, and customer experiences. However, amidst this progress, challenges and concerns have emerged.
The development of generative AI has raised concerns from governments, businesses, and individuals, resulting in protests, legal actions, calls to pause AI experiments, and actions by multiple governments. In a July 2023 briefing of the United Nations Security Council, Secretary-General António Guterres stated "Generative AI has enormous potential for good and evil at scale", that AI may "turbocharge global development" and contribute between $10 and $15 trillion to the global economy by 2030, but that its malicious use "could cause horrific levels of death and destruction, widespread trauma, and deep psychological damage on an unimaginable scale".[66]
Job losses
From the early days of the development of AI, there have been arguments put forward by ELIZA creator Joseph Weizenbaum and others about whether tasks that can be done by computers actually should be done by them, given the difference between computers and humans, and between quantitative calculations and qualitative, value-based judgements.[68] In April 2023, it was reported that image generation AI has resulted in 70% of the jobs for video game illustrators in China being lost.[69][70] In July 2023, developments in generative AI contributed to the 2023 Hollywood labor disputes. Fran Drescher, president of the Screen Actors Guild, declared that "artificial intelligence poses an existential threat to creative professions" during the 2023 SAG-AFTRA strike.[71] Voice generation AI has been seen as a potential challenge to the voice acting sector.[72][73]
The intersection of AI and employment concerns among underrepresented groups globally remains a critical facet. While AI promises efficiency enhancements and skill acquisition, concerns about job displacement and biased recruiting processes persist among these groups, as outlined in surveys by Fast Company. To leverage AI for a more equitable society, proactive steps encompass mitigating biases, advocating transparency, respecting privacy and consent, and embracing diverse teams and ethical considerations. Strategies involve redirecting policy emphasis on regulation, inclusive design, and education's potential for personalized teaching to maximize benefits while minimizing harms.[74]
Finance
In the financial realm, significant investment surges, as highlighted in discussions by Daron Acemoglu, have led to transformative tools like robo-advisors, reshaping traditional financial practices. Acemoglu's warnings about potential adverse societal consequences stemming from AI, particularly in data harvesting, customer manipulation, and labor market disparities, underscore the complexities of AI's impact on society.[75]
Social Identities
AI's integration with social identities, elucidated by Marcin Frackiewicz, holds both promises and challenges. AI's ability to transform traditional research methods, unveiling nuanced patterns within social identity realms, has immense potential. However, biases ingrained in AI systems perpetuate stereotypes and marginalize groups, emphasizing the critical need to address these biases for inclusivity.[76]
Deepfakes
Deepfakes (a portmanteau of "deep learning" and "fake"[77]) are AI-generated media that take a person in an existing image or video and replace them with someone else's likeness using artificial neural networks.[78] Deepfakes have garnered widespread attention and concerns for their uses in deepfake celebrity pornographic videos, revenge porn, fake news, hoaxes, and financial fraud.[79][80][81][82] This has elicited responses from both industry and government to detect and limit their use.[83][84]
Audio deepfakes
Instances of users abusing software to generate controversial statements in the vocal style of celebrities, public officials, and other famous individuals have raised ethical concerns over voice generation AI.[85][86][87][88][89][90] In response, companies such as ElevenLabs have stated that they would work on mitigating potential abuse through safeguards and identity verification.[91]
Both concerns and fandom have sprung up around AI-generated music. The same software used to clone voices has been used on famous musicians' voices to create songs which mimic them, gaining both tremendous popularity and criticism.[92][93][94] Similar techniques have also been used to create improved-quality or full-length versions of songs that have been leaked or have yet to be released.[95]
Generative AI has also been used to create new digital artist personalities, with some of these receiving enough attention to receive record deals at major labels.[96] The developers of these virtual artists have also faced their fair share of criticism for their personified programs, including backlash for "dehumanizing" an artform, and also creating artists which create unrealistic or immoral appeals to their audiences.[97]
Cybercrime
Generative AI's ability to create realistic fake content has been exploited in numerous types of cybercrime, including phishing scams.[98] Deepfake video and audio have been used to create disinformation and fraud. Former Google fraud czar Shuman Ghosemajumder has predicted that while deepfake videos initially created a stir in the media, they would soon become commonplace, and as a result, more dangerous.[99] Additionally, large language models and other forms of text-generation AI have been used at a broad scale to create fake reviews on ecommerce websites to boost ratings.[100] Cybercriminals have created large language models focused on fraud, including WormGPT and FraudGPT.[101]
Research published in 2023 found that generative AI has weaknesses that criminals can manipulate to extract harmful information, bypassing ethical safeguards. The study presents example attacks on ChatGPT, including jailbreaks and reverse psychology. Additionally, malicious individuals can use ChatGPT for social engineering and phishing attacks, revealing the harmful side of these technologies.[102]
Misuse in journalism
In January 2023, Futurism.com broke the story that CNET had been using an undisclosed internal AI tool to write at least 77 of its stories; after the news broke, CNET posted corrections to 41 of the stories.[103]
In April 2023, German tabloid Die Aktuelle published a fake AI-generated interview with former racing driver Michael Schumacher, who had not made any public appearances since 2013 after sustaining a brain injury in a skiing accident. The story included two possible disclosures: the cover included the line "deceptively real", and the interview included an acknowledgement at the end that it was AI-generated. The editor-in-chief was fired shortly thereafter amid the controversy.[104]
Regulation
In the European Union, the proposed Artificial Intelligence Act includes requirements to disclose copyrighted material used to train generative AI systems, and to label any AI-generated output as such.[105][106]
In the United States, a group of companies including OpenAI, Alphabet, and Meta signed a voluntary agreement with the White House in July 2023 to watermark AI-generated content.[107]
In China, the Interim Measures for the Management of Generative AI Services introduced by the Cyberspace Administration of China regulates any public-facing generative AI. It includes requirements to watermark generated images or videos, regulations on training data and label quality, restrictions on personal data collection, and a guideline that generative AI must "adhere to socialist core values".[108][109]
See also
- Artificial general intelligence – Hypothetical human-level or stronger AI
- Artificial imagination – Artificial simulation of human imagination
- Artificial intelligence art – Machine application of knowledge of human aesthetic expressions
- Computational creativity – Multidisciplinary endeavour
- Generative adversarial network – Deep learning method
- Generative pre-trained transformer – Type of large language model
- Large language model – Neural network with billions of weights
- Music and artificial intelligence – Common subject in the International Computer Music Conference
- Procedural generation – Method in which data is created algorithmically as opposed to manually
- Stochastic parrot – Term used in machine learning
References
Our deliberator is a traditional generative AI planner based on the HSTS planning framework (Muscettola, 1994), and our control component is a traditional spacecraft attitude control system (Hackney et al. 1993). We also add an architectural component explicitly dedicated to world modeling (the mode identifier), and distinguish between control and monitoring.
Text-to-video is the next frontier for generative AI, though current output is rudimentary. Runway says it'll be making its new generative video model, Gen-2, available to users in 'the coming weeks.'
Microsoft is bringing generative artificial intelligence technologies such as the popular ChatGPT chatting app to its Microsoft 365 suite of business software....the new A.I. features, dubbed Copilot, will be available in some of the company's most popular business apps, including Word, PowerPoint and Excel.
The Google Photos app is getting a redesigned, AI-powered Memories feature...you'll be able to use generative AI to come up with some suggested names like "a desert adventure".
Generative artificial intelligence (AI) will become one of the most important features for creative designers and marketers. Adobe on Tuesday unveiled a Generative Fill feature in Photoshop to bring Firefly's AI capabilities into design.
If you want to run LLaMA 2 on your own machine or modify the code, you can download it directly from Hugging Face, a leading platform for sharing AI models.
Using a Pi 4 with 8GB of RAM, you can create a ChatGPT-like server based on LLaMA.
Draw Things is an app that brings Stable Diffusion to the iPhone. The AI images are generated locally, so you don't need an Internet connection.
To run LLaMA model at home, you will need a computer build with a powerful GPU that can handle the large amount of data and computation required for inferencing.
the A800 operates at 70% of the speed of A100 GPUs while complying with strict U.S. export standards that limit how much processing power Nvidia can sell.
SAG-AFTRA has joined the Writer's [sic] Guild of America in demanding a contract that explicitly demands AI regulations to protect writers and the works they create. ... The future of generative artificial intelligence in Hollywood—and how it can be used to replace labor—has become a crucial sticking point for actors going on strike. In a news conference Thursday, Fran Drescher, president of the Screen Actors Guild-American Federation of Television and Radio Artists (more commonly known as SAG-AFTRA), declared that 'artificial intelligence poses an existential threat to creative professions, and all actors and performers deserve contract language that protects them from having their identity and talent exploited without consent and pay.'