Artificial intelligence
NCSOFT unveils AI dataset to rival hyperscale language models
FoCus Dataset is the first of its kind, utilizing both user personas and outside knowledge
Apr 14, 2022 (GMT+09:00)
NCSOFT, the South Korean game developer and publisher headquartered in Pangyo, is positioning its latest development as a much-awaited rival to the hyperscale language models dominating the natural language processing (NLP) field.
Lim Hui-seok, a professor of computer science and engineering at Korea University, led the research. Lim also heads the university's NLP and AI research center.
The collection of data is named FoCus Dataset, short for For Customized Conversation Dataset.
The research team says it is the first such dataset that encompasses both user persona and outside knowledge. As it stands, it comprises more than 15,000 conversations on some 8,000 subjects.
An AI equipped with the FoCus Dataset will be able to comprehend the experience and preferences of the person with whom it is having a conversation. It will also be able to source and learn the latest information available on Wikipedia in real time.
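As a rough illustration of what grounding a reply in both a persona and outside knowledge could involve, the Python sketch below stores a conversation alongside persona sentences and knowledge passages, then picks the entry from each that best matches a user's question. The class and field names, the sample sentences and the simple word-overlap matching are all assumptions made for illustration; they are not the FoCus Dataset's actual format or the research team's method.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class GroundedDialogue:
    """Hypothetical record: one conversation plus its two grounding sources."""
    persona: List[str]      # sentences about the user's experience and preferences
    knowledge: List[str]    # passages from an outside source such as Wikipedia
    turns: List[str] = field(default_factory=list)


def tokens(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of words, dropping punctuation."""
    return {w.strip(".,!?'\"").lower() for w in text.split()} - {""}


def best_match(question: str, candidates: List[str]) -> str:
    """Return the candidate that shares the most words with the question."""
    q = tokens(question)
    return max(candidates, key=lambda c: len(q & tokens(c)))


# Illustrative sample data, not taken from the dataset.
dialogue = GroundedDialogue(
    persona=[
        "I enjoy visiting a palace whenever I travel.",
        "I dislike spicy food.",
    ],
    knowledge=[
        "Gyeongbokgung is a royal palace built in 1395 in Seoul.",
        "Namsan Tower is a communication tower on Namsan Mountain.",
    ],
)
question = "Which palace in Seoul should I visit?"
dialogue.turns.append(question)

# Pick the persona sentence and knowledge passage most relevant to the question.
persona_hit = best_match(question, dialogue.persona)
knowledge_hit = best_match(question, dialogue.knowledge)
print(persona_hit)
print(knowledge_hit)
```

A real system would likely replace the word-overlap step with a trained model, but the shape of the task is the same: choose what the user cares about, choose the relevant facts, then answer using both.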
The collection and utilization of language data for AI adaptation falls in the NLP category. The goal of the machine learning technology is to program computers to process and analyze large amounts of the language spoken by humans for seamless communication between machines and people.
In this process, a persona refers to a profile that represents a large segment of data. It is easier to test a given strategy against an average of different individuals, i.e., a persona, than against thousands of individuals.
What sets FoCus Dataset apart from other data collections is that it can enable sophisticated conversations without the help of hyperscale language models.
Even though typical large-scale language models take a long time to train and to deduce meaning from data, they still hit a bottleneck when it comes to inferring real-time data and reflecting personal experiences.
In late February, NCSOFT and Korea University jointly published a paper on the dataset at the AAAI 2022 conference. Founded in 1979, the Association for the Advancement of Artificial Intelligence is one of the highest-regarded scientific societies in the AI community.
Come this October, the two entities will host the first workshop on the customized chat technology at COLING 2022, an international conference on computational linguistics.
“Recently in the NLP academic circle, the need for alternative conversation technologies that will rival hyperscale language models has risen – for financial and environmental reasons,” Lee Yeon-soo, director of NCSOFT’s Language AI Lab, said.
The lead scientist at NCSOFT elaborated that he hopes the dataset will spark vibrant conversation and technological development within the NLP sector.
NCSOFT is best known for the distribution of massively multiplayer online role-playing games (MMORPGs) such as Lineage and Guild Wars. In recent years, it has been expanding its foothold in other tech sectors.
Write to Jee Abbey Lee at jal@hankyung.com