Join me to get your feet wet with thousands of models available on Hugging Face! Hugging Face is like a CRAN of pre-trained AI/ML models. There are thousands of pre-trained models that can be imported and used within seconds at no charge to achieve tasks like text generation, text classification, translation, speech recognition, image classification, object detection, etc. In this post, I am exploring how to access these pre-trained models without leaving the comfort of RStudio using the reticulate package.
The reticulate package

The reticulate package provides an interface to call and run Python from R. There is an excellent website with many details about this package, so I will not repeat the same information here. I would recommend spending some time on that website and installing the reticulate package following its instructions. In addition, I also had Anaconda previously installed on my computer, which already comes with Python.
First, install the reticulate package and Anaconda. Then, load the reticulate package and check the default Python configuration.
library(reticulate)
py_config()
python: C:/Users/cengiz/Anaconda3/python.exe
libpython: C:/Users/cengiz/Anaconda3/python38.dll
pythonhome: C:/Users/cengiz/Anaconda3
version: 3.8.5 (default, Sep 3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)]
Architecture: 64bit
numpy: C:/Users/cengiz/Anaconda3/Lib/site-packages/numpy
numpy_version: 1.19.2
NOTE: Python version was forced by RETICULATE_PYTHON
Currently, it is set to use Python 3.8, which came with Anaconda. If you see nothing when you run py_config(), you need to install Python. If you don't have Anaconda or Python installed on your computer, you may check the install_miniconda() function. You can install Python directly using this function, and it creates a default virtual Python environment (r-reticulate) that you can use to import Python modules moving forward. If you have never done this before, a set of useful functions to check is listed below.
?install_miniconda
?conda_list
?conda_install
?use_condaenv
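If you are starting completely from scratch, a minimal sequence might look like the sketch below (r-reticulate is the default environment name that install_miniconda() creates; adjust to your own setup):

library(reticulate)

# Install a minimal Python distribution; this also creates
# the default r-reticulate conda environment
install_miniconda()

# List the conda environments now available on this machine
conda_list()

# Declare the environment to use for this R session
use_condaenv('r-reticulate')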
conda_list() returns the Python environments previously created on your computer.
conda_list()
          name                                                                       python
1    Anaconda3                                      C:\\Users\\cengiz\\Anaconda3\\python.exe
2 r-reticulate C:\\Users\\cengiz\\AppData\\Local\\r-miniconda\\envs\\r-reticulate\\python.exe
The output indicates that there are two Python environments on my computer. The first one is the base Python environment created when I installed Anaconda. The second one is the r-reticulate environment created when I installed r-miniconda along with the reticulate package. I will be using the base environment that comes with Anaconda, so I declare it below using the use_condaenv() function.
use_condaenv('Anaconda3')
All Python modules I will need moving forward will be installed in this environment. You can install Python modules using the conda_install() function, similar to the install.packages() function used to install a new R package. Below is an example of installing the transformers module, followed by a list of all other modules needed for the rest of this post. These modules should all be installed so that the rest of the code in this post works.
# Install the Python module into your specified Python environment
conda_install(envname  = 'Anaconda3',
              packages = 'transformers',
              pip      = TRUE)
# List of Python modules needed in this post
# transformers
# torch
# torchvision
# numpy
# PIL
# librosa
# requests
# timm
# detoxify
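Since all of these modules are needed, a single conda_install() call with the vector of module names takes care of everything (a sketch; note that the PIL module is installed under the package name Pillow):

# Install all modules needed for this post in one call
# ('Pillow' is the package that provides the PIL module)
conda_install(envname  = 'Anaconda3',
              packages = c('transformers', 'torch', 'torchvision', 'numpy',
                           'Pillow', 'librosa', 'requests', 'timm', 'detoxify'),
              pip      = TRUE)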
I will also use the magick and kableExtra packages in R at the end while dealing with the object detection task (kbl() and kable_styling() come from kableExtra; there is no standalone kable package on CRAN).

# Install packages
install.packages('magick')
install.packages('kableExtra')
If you go to the Models tab of Hugging Face, there are more than 27,000 models available to use. I find it similar to the CRAN repository for R packages, except it is for pre-trained AI/ML models. Some of these models probably cost tens of thousands of dollars to train, if not more.
These models are currently grouped into three major areas (Natural Language Processing, Audio, and Computer Vision), and each area has multiple tags for different types of tasks.
When you click a tag, you can filter the specific subset of models developed for that tag. In this post, I checked the most downloaded model for each tag and tried to reproduce an example of using this model to accomplish the task. Some of them were straightforward, but some required extra effort searching the web due to a lack of documentation. For some specific tasks and models I couldn't find much information, so I abandoned them. At the end of each demo, I provide links to the pages I learned from while trying to reproduce the examples.
I am not an expert in any of these models, and I am a beginner Python user. I will try to explain things as much as possible, but please take my explanations with a grain of salt. My original intention when starting this post was only to reproduce some examples.
The Fill-Mask task is used to provide partial information in a text and ask the NLP model to complete the sentence for you. For instance, you can write a sentence like the following:
Istanbul is the _____ of Turkey.
There are many pre-trained NLP models for any NLP task, and I will use roberta-base for this task. First, we import the Python libraries and then define tokenizer and model as two objects in the R environment. These objects will be downloaded to your computer when you first use them. Note that some of these models are very big, so make sure you have enough space on your computer.
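If disk space on your system drive is a concern, you can point the Hugging Face download cache somewhere else before importing transformers (a sketch; TRANSFORMERS_CACHE is the environment variable the transformers module checks, and the path below is hypothetical):

# Redirect the Hugging Face model cache to a larger drive (hypothetical path)
Sys.setenv(TRANSFORMERS_CACHE = 'D:/hf_cache')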
transformers <- import('transformers')
torch        <- import('torch')

tokenizer <- transformers$AutoTokenizer$from_pretrained('roberta-base')
model     <- transformers$AutoModelForMaskedLM$from_pretrained('roberta-base')
Each NLP model may have a different format for the masked word. So, it is a good idea to check the mask token.
tokenizer$mask_token
[1] "<mask>"
tokenizer$mask_token_id
[1] 50264
Now, we can prepare the input text accordingly and tokenize it.
txt <- 'Istanbul is the <mask> of Turkey.'

input <- tokenizer$encode(text = txt, return_tensors = "pt")
input
tensor([[    0,   100, 46770,    16,     5, 50264,     9,  2769,     4,     2]])
input$shape
torch.Size([1, 10])
This process encodes the words in our sentence (plus some hidden special tokens) into their numeric representations. For instance, the numeric code for the mask token is 50264, and the numeric code for " the" (with a leading space) is 5. The returned object is a tensor of length 10.
tokenizer$encode('<mask>')
[1]     0 50264     2
tokenizer$encode(' the')
[1] 0 5 2
I will locate the position of the mask token in the input tensor, because I will need it later.
loc <- which(input$tolist()[[1]] == tokenizer$mask_token_id)
loc
[1] 6
# <mask> is the 6th token in my input tensor
I will submit the input tensor to the model.
token_logits <- model(input)$logits$detach()
token_logits
tensor([[[33.1248, -4.0028, 18.4557, ..., 2.9945, 5.8644, 11.4540],
[ 8.9463, -3.9279, 20.4999, ..., 2.2690, 2.1061, 4.4710],
[ 9.1800, -3.5469, 8.1455, ..., 1.8752, 3.0497, 3.4087],
...,
[ 5.8352, -3.7452, 6.6445, ..., 0.6672, 0.3624, 2.9578],
[18.0251, -4.6203, 19.6356, ..., 0.8772, 3.5524, 7.1917],
[12.1654, -3.9563, 31.6049, ..., 1.1393, -0.8462, 9.7371]]])
token_logits$shape
torch.Size([1, 10, 50265])
The output returns logits for all 50265 words in the roberta-base dictionary at each token position (there are ten tokens in my sentence). These logits represent the likelihood of each word in the dictionary at that specific token position. In this case, my only interest is the masked token position.
# Note that Python indices start from 0,
# so we ask for loc - 1 below
masked_token_logits <- token_logits[0][loc-1]
masked_token_logits
tensor([-3.5709, -3.8466,  3.3165,  ..., -5.0920, -4.6620, -1.2226])
masked_token_logits$shape
torch.Size([50265])
# Find the top three words based on the probabilities calculated from the model
top_3 <- torch$topk(masked_token_logits, k = 3L)
top_3
torch.return_types.topk(
values=tensor([24.5527, 20.6755, 19.7865]),
indices=tensor([ 812, 1867, 1312]))
# Decode these indices to find the corresponding words
# from the roberta-base dictionary
unmask <- tokenizer$decode(token_ids = top_3['indices'])
unmask
[1] " capital Capital center"
The model says the three words most likely to fill the missing slot are capital, Capital, and center. Below is some formatting to get the sentence with each of these three words.
unmask_ <- strsplit(unmask, ' ')[[1]][-1]
gsub(pattern = '<mask>', replacement = unmask_[1], x = txt)
[1] "Istanbul is the capital of Turkey."
gsub(pattern = '<mask>', replacement = unmask_[2], x = txt)
[1] "Istanbul is the Capital of Turkey."
gsub(pattern = '<mask>', replacement = unmask_[3], x = txt)
[1] "Istanbul is the center of Turkey."
We can use a different NLP model, and the code will be almost identical. The only difference is the mask token. For instance, if we use the bert-base-uncased model, the mask token is defined as [MASK]. The code below runs the same task with the bert-base-uncased model. As you will see, the three words this model predicts for the missing piece are capital, heart, and birthplace.
transformers <- import('transformers')
torch        <- import('torch')

tokenizer <- transformers$AutoTokenizer$from_pretrained('bert-base-uncased')
model     <- transformers$AutoModelForMaskedLM$from_pretrained('bert-base-uncased')

tokenizer$mask_token
[1] "[MASK]"
tokenizer$mask_token_id
[1] 103

txt <- 'Istanbul is the [MASK] of Turkey.'

input <- tokenizer$encode(text = txt, return_tensors = "pt")
input
tensor([[  101,  9960,  2003,  1996,   103,  1997,  4977,  1012,   102]])

loc <- which(input$tolist()[[1]] == tokenizer$mask_token_id)

token_logits <- model(input)$logits

masked_token_logits <- token_logits[0][loc-1]

top_3 <- torch$topk(masked_token_logits, k = as.integer(3))

unmask <- tokenizer$decode(token_ids = top_3['indices'])

unmask_ <- strsplit(unmask, ' ')[[1]]
gsub(pattern = '\\[MASK]', replacement = unmask_[1], x = txt)
[1] "Istanbul is the capital of Turkey."
gsub(pattern = '\\[MASK]', replacement = unmask_[2], x = txt)
[1] "Istanbul is the heart of Turkey."
gsub(pattern = '\\[MASK]', replacement = unmask_[3], x = txt)
[1] "Istanbul is the birthplace of Turkey."
Resources
A given text can be classified in many different ways. The most popular is sentiment analysis: predicting whether a text carries a negative, neutral, or positive sentiment. Or, we can try to predict the emotion in a text (sadness, joy, love, anger, fear, surprise). We can also try to predict whether or not a given text is toxic. The subset of models on Hugging Face for text classification offers a variety of analyses. I will reproduce examples from three models predicting slightly different things for a given text.
The first example is the most downloaded model in this category, cardiffnlp/twitter-roberta-base-sentiment. Let's import the modules we will need and load the tokenizer and model objects for this specific model.
transformers <- import('transformers')

tokenizer <- transformers$AutoTokenizer$from_pretrained('cardiffnlp/twitter-roberta-base-sentiment')
model     <- transformers$AutoModelForSequenceClassification$from_pretrained('cardiffnlp/twitter-roberta-base-sentiment')
The input text is plain text as a character string, and we tokenize it using the tokenizer object. Note that I tried to trick the model by putting some negative words in the text even though the sentence's sentiment is positive.
<- "Dr. Z's class was not very boring and not disorganized. I would definitely take it again."
txt
<- tokenizer$encode(text = txt,return_tensors="pt")
input input
tensor([[ 0, 14043, 4, 525, 18, 1380, 21, 45, 182, 15305,
8, 45, 2982, 29835, 4, 38, 74, 2299, 185, 24,
456, 4, 2]])
$shape input
torch.Size([1, 23])
After tokenization, we submit the input tensor to the model object, producing logits.
output <- model(input)$logits
output
tensor([[-2.1241, -0.1090,  2.6949]], grad_fn=<AddmmBackward0>)
output$shape
torch.Size([1, 3])
The final output is a tensor with three numbers, corresponding to three categories: Negative (the first element), Neutral (the second element), and Positive (the third element). To transform the logits into probabilities, we apply a softmax transformation.
scores <- output$detach()$numpy()
scores
       [,1]   [,2]  [,3]
[1,] -2.124 -0.109 2.695

# Softmax transformation to get probabilities for Negative, Neutral, and Positive
exp(scores)/sum(exp(scores))
         [,1]    [,2]   [,3]
[1,] 0.007556 0.05668 0.9358

# the first probability is for Negative
# the second probability is for Neutral
# the third probability is for Positive
The model predicts that the probability of this text having a positive sentiment is 0.936, a neutral sentiment is 0.057, and a negative sentiment is 0.008. (Nice job!)
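Since this exp(scores)/sum(exp(scores)) step recurs throughout the post, a small helper function saves some retyping (a sketch, not part of the original workflow):

# Softmax helper: convert a vector of logits into probabilities
softmax <- function(x) exp(x)/sum(exp(x))

softmax(as.numeric(scores))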
Resources:
The second example is another model in the Text Classification category, specifically developed for detecting emotions such as sadness, joy, love, anger, fear, and surprise: bhadresh-savani/distilbert-base-uncased-emotion.
transformers <- import('transformers')

tokenizer <- transformers$AutoTokenizer$from_pretrained('bhadresh-savani/distilbert-base-uncased-emotion')
model     <- transformers$AutoModelForSequenceClassification$from_pretrained('bhadresh-savani/distilbert-base-uncased-emotion')
I will use the same input text, tokenize it using the tokenizer object, and then submit the input tensor to the model to obtain the logits.
<- "Dr. Z's class was not very boring and not disorganized. I would definitely take it again."
txt
<- tokenizer$encode(text = txt,return_tensors="pt")
input
<- model(input)$logits
output output
tensor([[ 6.2330, -0.8139, -2.2702, 0.0108, -2.2436, -2.2626]],
grad_fn=<AddmmBackward0>)
$shape output
torch.Size([1, 6])
The output returns six numbers corresponding to six categories. We can check the order of the categories to match each number to its label.
model$config$id2label
$`0`
[1] "sadness"
$`1`
[1] "joy"
$`2`
[1] "love"
$`3`
[1] "anger"
$`4`
[1] "fear"
$`5`
[1] "surprise"
Let’s apply the softmax transformation to obtain the probabilities for each category.
scores <- output$detach()$numpy()

prob <- exp(scores)/sum(exp(scores))

data.frame(class = unlist(model$config$id2label),
           prob  = as.numeric(prob))
class prob
0 sadness 0.9965414
1 joy 0.0008671
2 love 0.0002021
3 anger 0.0019782
4 fear 0.0002076
5 surprise 0.0002037
The model's predicted emotion for this text is sadness, with a probability estimate of 0.996 (!!!).
Resources:
The final example in this category is the Detoxify module. This Python module has trained models to predict toxic comments based on the datasets from three Jigsaw challenges on Kaggle: Toxic Comment Classification, Unintended Bias in Toxic Comments, and Multilingual Toxic Comment Classification.
For a given sentence, the model returns a probability estimate in six areas: toxicity, severe toxicity, obscene, threat, insult, and identity attack. Let's take the following text and get predictions from this model.
Immigrants are stealing our jobs. Send them back to where they come from! THEY DON’T DESERVE TO LIVE IN AMERICA!
# Load the Python module
detoxify <- import('detoxify')

# Input text
txt <- "Immigrants are stealing our jobs. Send them back to where they come from! THEY DON'T DESERVE TO LIVE IN AMERICA!"

# Predict
pred <- detoxify$Detoxify('original')$predict(txt)
unlist(pred)
       toxicity severe_toxicity         obscene          threat
       0.832002        0.009733        0.024375        0.018156
         insult identity_attack
       0.091395        0.403692
Note that the text input can be a vector of texts. For instance, we can take 5 text strings as a vector and return the probability estimates in these six areas as a 5 x 6 matrix.
<- c("No, he is an arrogant, self serving, immature idiot. Get it right.",
txt "Simple. You are stupid!",
"The overall organization and text are good.",
"Who is the man in the high castle?",
"This is worse than I thought. This user has a sockpuppet account!!")
<- detoxify$Detoxify('original')$predict(txt)
pred
unlist(pred)
toxicity1 toxicity2 toxicity3 toxicity4
0.98233998 0.98080528 0.00055082 0.00116317
toxicity5 severe_toxicity1 severe_toxicity2 severe_toxicity3
0.31779119 0.02491754 0.02204690 0.00014005
severe_toxicity4 severe_toxicity5 obscene1 obscene2
0.00009857 0.00031965 0.68105346 0.62617028
obscene3 obscene4 obscene5 threat1
0.00020516 0.00016786 0.00766684 0.00100577
threat2 threat3 threat4 threat5
0.00104363 0.00014156 0.00010783 0.00043173
insult1 insult2 insult3 insult4
0.92518520 0.93162781 0.00018154 0.00018590
insult5 identity_attack1 identity_attack2 identity_attack3
0.01330768 0.01747648 0.00959109 0.00014455
identity_attack4 identity_attack5
0.00014807 0.00074668
# Some reorganization of the output
probs <- matrix(unlist(pred),
                nrow  = 5,
                ncol  = 6,
                byrow = FALSE)

data.frame(txt = txt,
           toxicity        = round(probs[,1],2),
           severe_toxicity = round(probs[,2],2),
           obscene         = round(probs[,3],2),
           threat          = round(probs[,4],2),
           insult          = round(probs[,5],2),
           identity_attack = round(probs[,6],2))
txt
1 No, he is an arrogant, self serving, immature idiot. Get it right.
2 Simple. You are stupid!
3 The overall organization and text are good.
4 Who is the man in the high castle?
5 This is worse than I thought. This user has a sockpuppet account!!
toxicity severe_toxicity obscene threat insult identity_attack
1 0.98 0.02 0.68 0 0.93 0.02
2 0.98 0.02 0.63 0 0.93 0.01
3 0.00 0.00 0.00 0 0.00 0.00
4 0.00 0.00 0.00 0 0.00 0.00
5 0.32 0.00 0.01 0 0.01 0.00
Note that the probabilities within a row do not necessarily add up to one. If I am not mistaken, the model has sub-models that make an independent binary prediction for each category.
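We can verify this quickly with the probs matrix computed above:

# The row sums are not 1, consistent with independent binary predictions
rowSums(probs)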
Extractive Question Answering is the task of extracting an answer from a text given a question. We provide two inputs as text strings: the first is a question, and the second is a context. The model extracts the answer to the question from the given context.
For instance, let’s say we have the following text as a context.
In probability theory, a normal (or Gaussian or Gauss or Laplace–Gauss) distribution is a type of continuous probability distribution for a real-valued random variable. The parameter mu is the mean or expectation of the distribution (and also its median and mode), while the parameter sigma is its standard deviation. A random variable with a Gaussian distribution is said to be normally distributed, and is called a normal deviate. Normal distributions are important in statistics and are often used in the natural and social sciences to represent real-valued random variables whose distributions are not known. Their importance is partly due to the central limit theorem. It states that, under some conditions, the average of many samples (observations) of a random variable with finite mean and variance is itself a random variable—whose distribution converges to a normal distribution as the number of samples increases. Therefore, physical quantities that are expected to be the sum of many independent processes, such as measurement errors, often have distributions that are nearly normal.
Then, we ask the following question.
What is the parameter mu in a normal distribution?
The model should find the relevant part of the text and extract the span that answers this question. See below the code for this example using the most popular model in this category, deepset/roberta-base-squad2.
# Load the Python libraries
transformers <- import('transformers')
torchvision  <- import('torchvision')
torch        <- import('torch')

# Load the tokenizer and model for roberta-base-squad2
tokenizer <- transformers$AutoTokenizer$from_pretrained('deepset/roberta-base-squad2')
model     <- transformers$AutoModelForQuestionAnswering$from_pretrained('deepset/roberta-base-squad2')

# Text inputs for the question and context
question <- "What is the parameter mu in a normal distribution?"

# Copy/paste the same text from context as written above
# For some reason RMarkdown doesn't display it in the code below when I knit
# the document
context <- "In probability theory, a normal (or Gaussian or Gauss or Laplace–Gauss) distribution is a type of continuous probability distribution for a real-valued random variable. The parameter mu is the mean or expectation of the distribution (and also its median and mode), while the parameter sigma is its standard deviation. A random variable with a Gaussian distribution is said to be normally distributed, and is called a normal deviate. Normal distributions are important in statistics and are often used in the natural and social sciences to represent real-valued random variables whose distributions are not known. Their importance is partly due to the central limit theorem. It states that, under some conditions, the average of many samples (observations) of a random variable with finite mean and variance is itself a random variable—whose distribution converges to a normal distribution as the number of samples increases. Therefore, physical quantities that are expected to be the sum of many independent processes, such as measurement errors, often have distributions that are nearly normal."

# Tokenize the inputs
input <- tokenizer$encode(text = question,
                          text_pair = context,
                          return_tensors = "pt")
input$shape
torch.Size([1, 219])
# there are 219 tokens in the combined question + context input
# Submit the input tensor to the model
output <- model(input)

# The model returns two elements:
# the first element (output$start_logits) includes the logits for the
# starting position of the answer;
# the second element (output$end_logits) includes the logits for the
# ending position of the answer

# Extract the most likely token position for the start of the answer
start <- output$start_logits$argmax(-1L)$item()
start
[1] 53

# Extract the most likely token position for the end of the answer
end <- output$end_logits$argmax(-1L)$item()
end
[1] 59

# Decode the tokens between the starting and ending positions
# This is the answer the model predicts for the given question
tokenizer$decode(input[0][start:end])
[1] " the mean or expectation of the distribution"
Resources:
In Summarization, the model takes a longer text and generates a shorter text as a summary.
For instance, let’s say we have the following text.
New York (CNN)When Liana Barrientos was 23 years old, she got married in Westchester County, New York. A year later, she got married again in Westchester County, but to a different man and without divorcing her first husband. Only 18 days after that marriage, she got hitched yet again. Then, Barrientos declared ‘I do’ five more times, sometimes only within two weeks of each other. In 2010, she married once more, this time in the Bronx. In an application for a marriage license, she stated it was her ‘first and only’ marriage. Barrientos, now 39, is facing two criminal counts of ‘offering a false instrument for filing in the first degree,’ referring to her false statements on the 2010 marriage license application, according to court documents. Prosecutors said the marriages were part of an immigration scam. On Friday, she pleaded not guilty at State Supreme Court in the Bronx, according to her attorney, Christopher Wright, who declined to comment further. After leaving court, Barrientos was arrested and charged with theft of service and criminal trespass for allegedly sneaking into the New York subway through an emergency exit, said Detective Annette Markowski, a police spokeswoman. In total, Barrientos has been married 10 times, with nine of her marriages occurring between 1999 and 2002. All occurred either in Westchester County, Long Island, New Jersey or the Bronx. She is believed to still be married to four men, and at one time, she was married to eight men at once, prosecutors say. Prosecutors said the immigration scam involved some of her husbands, who filed for permanent residence status shortly after the marriages. Any divorces happened only after such filings were approved. It was unclear whether any of the men will be prosecuted. The case was referred to the Bronx District Attorney’s Office by Immigration and Customs Enforcement and the Department of Homeland Security’s Investigation Division. Seven of the men are from so-called ‘red-flagged’ countries, including Egypt, Turkey, Georgia, Pakistan and Mali. Her eighth husband, Rashid Rajput, was deported in 2006 to his native Pakistan after an investigation by the Joint Terrorism Task Force. If convicted, Barrientos faces up to four years in prison. Her next court appearance is scheduled for May 18.
The code below generates a model-predicted summary for this text using the most downloaded model in this category, facebook/bart-large-cnn.
transformers <- import('transformers')

tokenizer <- transformers$AutoTokenizer$from_pretrained('facebook/bart-large-cnn')
model     <- transformers$AutoModelForSeq2SeqLM$from_pretrained('facebook/bart-large-cnn')

# Copy/paste the same text above using double quotes
# For some reason RMarkdown doesn't display it in the code below when I knit
# the document
txt <- "New York (CNN)When Liana Barrientos was 23 years old, she got married in Westchester County, New York. A year later, she got married again in Westchester County, but to a different man and without divorcing her first husband. Only 18 days after that marriage, she got hitched yet again. Then, Barrientos declared 'I do' five more times, sometimes only within two weeks of each other. In 2010, she married once more, this time in the Bronx. In an application for a marriage license, she stated it was her 'first and only' marriage. Barrientos, now 39, is facing two criminal counts of 'offering a false instrument for filing in the first degree,' referring to her false statements on the 2010 marriage license application, according to court documents. Prosecutors said the marriages were part of an immigration scam. On Friday, she pleaded not guilty at State Supreme Court in the Bronx, according to her attorney, Christopher Wright, who declined to comment further. After leaving court, Barrientos was arrested and charged with theft of service and criminal trespass for allegedly sneaking into the New York subway through an emergency exit, said Detective Annette Markowski, a police spokeswoman. In total, Barrientos has been married 10 times, with nine of her marriages occurring between 1999 and 2002. All occurred either in Westchester County, Long Island, New Jersey or the Bronx. She is believed to still be married to four men, and at one time, she was married to eight men at once, prosecutors say. Prosecutors said the immigration scam involved some of her husbands, who filed for permanent residence status shortly after the marriages. Any divorces happened only after such filings were approved. It was unclear whether any of the men will be prosecuted. The case was referred to the Bronx District Attorney's Office by Immigration and Customs Enforcement and the Department of Homeland Security's Investigation Division. Seven of the men are from so-called 'red-flagged' countries, including Egypt, Turkey, Georgia, Pakistan and Mali. Her eighth husband, Rashid Rajput, was deported in 2006 to his native Pakistan after an investigation by the Joint Terrorism Task Force. If convicted, Barrientos faces up to four years in prison. Her next court appearance is scheduled for May 18."

# Tokenize the input text
input <- tokenizer$encode(text = txt, return_tensors = "pt")

# Generate the predicted tokens for the summary text
output <- model$generate(input)

# Decode the output tokens
tokenizer$batch_decode(output)
[1] "</s><s>Liana Barrientos has been married 10 times, with nine of her marriages occurring between 1999 and 2002. She is believed to still be married to four men, and at one time, she was married to eight men at once. Her eighth husband, Rashid Rajput, was deported in 2006 to his native Pakistan after an investigation.</s>"

# Too long to print nicely, so I wrap it in kable
summary_txt <- as.matrix(tokenizer$batch_decode(output))

require(kableExtra)

summary_txt %>%
  kbl() %>%
  kable_styling()
</s><s>Liana Barrientos has been married 10 times, with nine of her marriages occurring between 1999 and 2002. She is believed to still be married to four men, and at one time, she was married to eight men at once. Her eighth husband, Rashid Rajput, was deported in 2006 to his native Pakistan after an investigation.</s>
Resources:
The purpose of the text generation task is to create a meaningful continuation of a text. For instance, suppose I start with the following sentence.
My name is Cengiz and I am from Turkey.
How would an NLP model continue this sentence? Below is the code to generate a continuation for this sentence using the GPT2 model.
# Load the module
transformers <- import('transformers')

# Load the tokenizer and model
tokenizer <- transformers$AutoTokenizer$from_pretrained('gpt2')
model     <- transformers$GPT2LMHeadModel$from_pretrained('gpt2')

# Input text
txt <- 'My name is Cengiz and I am from Turkey.'

# Tokenize the input
input <- tokenizer$encode(txt, return_tensors = 'pt')
input
tensor([[3666, 1438,  318,  327, 1516,  528,  290,  314,  716,  422, 7137,   13]])

# Generate the continuation text
# You can play with arguments like
# max_length, num_beams, no_repeat_ngram_size, num_return_sequences, etc.
# I don't know what some of these mean and how they impact the generated text
# 100L is just 100 as an integer;
# if you use only 100, it is numeric (double) and the function gives an error
# because it expects integers
output <- model$generate(input,
                         max_length = 100L,
                         num_beams = 20L,
                         no_repeat_ngram_size = 2L,
                         num_return_sequences = 1L,
                         early_stopping = TRUE)

# Too long to print nicely, so I wrap it in kable
new_txt <- as.matrix(tokenizer$batch_decode(output))

new_txt %>%
  kbl() %>%
  kable_styling()
My name is Cengiz and I am from Turkey. I was born and raised in Turkey and have been living in the United States for over 20 years. I have always been interested in learning about the world and what it is like to live in a country where you are not allowed to speak your mind. It is very difficult for me to understand what is going on in this country, but I do know that I have a lot to learn and that is what I want to do.
Note: I didn’t write this. This is what GPT2 wrote, and it is hilarious!
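For what it's worth, my rough understanding of the generate() arguments above: max_length caps the total number of tokens, num_beams sets the beam-search width, no_repeat_ngram_size forbids repeating any n-gram of that size, and early_stopping ends generation once the beams are finished. If you prefer less deterministic continuations, sampling-based decoding is an alternative (a sketch; do_sample and top_k are standard generate() arguments):

# Sampling-based generation instead of beam search;
# each run can produce a different continuation
output_sampled <- model$generate(input,
                                 max_length = 100L,
                                 do_sample  = TRUE,
                                 top_k      = 50L)
tokenizer$batch_decode(output_sampled)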
Resources:
Like Text Classification, there are different tasks that can be considered under the Text2Text Generation category. I will reproduce examples for Generating a Headline, Generating a Question with and without Supervision, and Paraphrasing.
The purpose of this task is to generate a headline for a given text. For instance, let’s consider the following text.
Very early yesterday morning, the United States President Donald Trump reported he and his wife First Lady Melania Trump tested positive for COVID-19. Officials said the Trumps’ 14-year-old son Barron tested negative as did First Family and Senior Advisors Jared Kushner and Ivanka Trump. Trump took to social media, posting at 12:54 am local time (0454 UTC) on Twitter, ‘Tonight, [Melania] and I tested positive for COVID-19. We will begin our quarantine and recovery process immediately. We will get through this TOGETHER!’ Yesterday afternoon Marine One landed on the White House’s South Lawn flying Trump to Walter Reed National Military Medical Center (WRNMMC) in Bethesda, Maryland. Reports said both were showing ‘mild symptoms’. Senior administration officials were tested as people were informed of the positive test. Senior advisor Hope Hicks had tested positive on Thursday. Presidential physician Sean Conley issued a statement saying Trump has been given zinc, vitamin D, Pepcid and a daily Aspirin. Conley also gave a single dose of the experimental polyclonal antibodies drug from Regeneron Pharmaceuticals. According to official statements, Trump, now operating from the WRNMMC, is to continue performing his duties as president during a 14-day quarantine. In the event of Trump becoming incapacitated, Vice President Mike Pence could take over the duties of president via the 25th Amendment of the US Constitution. The Pence family all tested negative as of yesterday and there were no changes regarding Pence’s campaign events.
Below is the code generating a headline for this text using the T5 language model fine tuned for this task.
Note that the input text string should start with headline:. For instance, the input string for the text above should be formatted as

headline: Very early yesterday morning, the United States President Donald Trump reported he and his wife First Lady …
# Load the modules
transformers <- import('transformers')
torch        <- import('torch')

# Load the tokenizer and model
tokenizer <- transformers$AutoTokenizer$from_pretrained('Michau/t5-base-en-generate-headline')
model     <- transformers$AutoModelForSeq2SeqLM$from_pretrained('Michau/t5-base-en-generate-headline')

# Input text string
# Copy/paste the article above
# Do not forget to put headline: at the beginning
article <- "headline: Very early yesterday morning, the United States President Donald Trump reported he and his wife First Lady Melania Trump tested positive for COVID-19. Officials said the Trumps' 14-year-old son Barron tested negative as did First Family and Senior Advisors Jared Kushner and Ivanka Trump. Trump took to social media, posting at 12:54 am local time (0454 UTC) on Twitter, 'Tonight, [Melania] and I tested positive for COVID-19. We will begin our quarantine and recovery process immediately. We will get through this TOGETHER!' Yesterday afternoon Marine One landed on the White House's South Lawn flying Trump to Walter Reed National Military Medical Center (WRNMMC) in Bethesda, Maryland. Reports said both were showing 'mild symptoms'. Senior administration officials were tested as people were informed of the positive test. Senior advisor Hope Hicks had tested positive on Thursday. Presidential physician Sean Conley issued a statement saying Trump has been given zinc, vitamin D, Pepcid and a daily Aspirin. Conley also gave a single dose of the experimental polyclonal antibodies drug from Regeneron Pharmaceuticals. According to official statements, Trump, now operating from the WRNMMC, is to continue performing his duties as president during a 14-day quarantine. In the event of Trump becoming incapacitated, Vice President Mike Pence could take over the duties of president via the 25th Amendment of the US Constitution. The Pence family all tested negative as of yesterday and there were no changes regarding Pence's campaign events."

# Tokenize the input string
input     <- tokenizer$encode_plus(text = article,
                                   return_tensors = 'pt')
input_ids <- input['input_ids']

# Model-predicted tokens for the headline
output <- model$generate(input_ids = input_ids)
output
tensor([[    0,  2523,    11,  1485,  8571,  5049, 11219,  2300, 24972,    21,
          2847,  7765,   308,  4481,     1]])

# Decode the model-predicted tokens
tokenizer$batch_decode(output)
[1] "<pad> Trump and First Lady Melania Test Positive for COVID-19</s>"
Resources:
There are models that can generate a question given a text string for an answer and another text string for the context. Below is an example using the T5 language model fine-tuned for this task.
# Load the module
transformers <- import('transformers')

# Load the tokenizer and model
tokenizer <- transformers$AutoTokenizer$from_pretrained('mrm8488/t5-base-finetuned-question-generation-ap')
model     <- transformers$AutoModelWithLMHead$from_pretrained('mrm8488/t5-base-finetuned-question-generation-ap')

# The input string should be formatted as below
# 'answer: ..... context: .....'
txt <- 'answer: 12 context: Apples'

# Tokenize the input and generate the question
input  <- tokenizer$encode(text = txt, return_tensors = 'pt')
output <- model$generate(input)
tokenizer$batch_decode(output)
[1] "<pad> question: How many apples are there?</s>"
# Another question with a different answer
txt <- 'answer: red context: Apples'

input  <- tokenizer$encode(text = txt, return_tensors = 'pt')

output <- model$generate(input)

tokenizer$batch_decode(output)
[1] "<pad> question: What color are apples?</s>"
# Another one
txt <- 'answer: decay context: Apples'

input  <- tokenizer$encode(text = txt, return_tensors = 'pt')

output <- model$generate(input)

tokenizer$batch_decode(output)
[1] "<pad> question: What do apples do?</s>"
Resources
This is similar to the previous task, but there is no supervision. In other words, we don't provide an answer; we only provide a context. The model generates questions from this context.
transformers <- import('transformers')

tokenizer <- transformers$AutoTokenizer$from_pretrained('valhalla/t5-base-e2e-qg')
model     <- transformers$AutoModelForSeq2SeqLM$from_pretrained('valhalla/t5-base-e2e-qg')

txt <- "I had twelwe apples. I ate two apples. Then, I gave five apples to my daughter. I didn't give any apple to my son."

input <- tokenizer$encode(text = txt,
                          return_tensors = 'pt')

output <- model$generate(input,
                         max_length = 50L)

qs <- tokenizer$batch_decode(output)

qs %>%
  as.matrix() %>%
  kbl() %>%
  kable_styling()

<pad> How many apples did I have?<sep> How many apples did I eat?<sep> How many apples did I give to my daughter?<sep> How many apples did I give to my son?<sep></s>
Resources:
For a given sentence, these models paraphrase the sentence.
transformers <- import('transformers')
torch        <- import('torch')

tokenizer <- transformers$AutoTokenizer$from_pretrained('tuner007/pegasus_paraphrase')
model     <- transformers$PegasusForConditionalGeneration$from_pretrained('tuner007/pegasus_paraphrase')

txt1 <- "Her life spanned years of incredible change for women as they gained more rights than ever before."
txt2 <- "Giraffes like Acacia leaves and hay, and they can consume 75 pounds of food a day."

# Paraphrase the first text
input <- tokenizer$encode(text = txt1, return_tensors = 'pt')

output <- model$generate(input)

tokenizer$batch_decode(output)
[1] "<pad> Her life was filled with change for women as they gained more rights than ever before.</s>"

# Paraphrase the second text
input <- tokenizer$encode(text = txt2, return_tensors = 'pt')

output <- model$generate(input)

tokenizer$batch_decode(output)
[1] "<pad> Giraffes can eat 75 pounds of food a day.</s>"
Resources:
There are models fine-tuned for the task of translating text from one language to another. Check the following page for the available translation pairs.
https://huggingface.co/Helsinki-NLP
I provide two examples. The first one is translating from Turkish to English, and the second one is translating from English to Spanish.
transformers <- import('transformers')

tokenizer <- transformers$AutoTokenizer$from_pretrained('Helsinki-NLP/opus-mt-tr-en')
model     <- transformers$AutoModelForSeq2SeqLM$from_pretrained('Helsinki-NLP/opus-mt-tr-en')

txt <- "Merhaba, ben Cengiz. Istanbul'da dogdum."

input <- tokenizer$encode(txt, return_tensors = 'pt')

output <- model$generate(input)

tokenizer$batch_decode(output)
[1] "<pad> Hi, I'm Genghis, born in Istanbul."
Resources:
transformers <- import('transformers')

tokenizer <- transformers$AutoTokenizer$from_pretrained('Helsinki-NLP/opus-mt-en-es')
model     <- transformers$AutoModelForSeq2SeqLM$from_pretrained('Helsinki-NLP/opus-mt-en-es')

txt <- "Hi, my name is Cengiz and I live in Eugene."

input <- tokenizer$encode(txt, return_tensors = 'pt')

output <- model$generate(input)

tokenizer$batch_decode(output)
[1] "<pad> Hola, mi nombre es Cengiz y vivo en Eugene."
Resources:
In Zero-Shot Classification, we provide a text string as an input and then provide a class label, which could be anything. The model then predicts the probability that the given text belongs to this class label.
For instance, let’s consider the following sentence.
I will see the world one day.
We can say that this sentence is related to, for instance, traveling or exploration, but not related to cooking. We can use a model to test each hypothesis.
# Load the modules, tokenizer, and model
transformers <- import('transformers')

tokenizer <- transformers$AutoTokenizer$from_pretrained('facebook/bart-large-mnli')
model     <- transformers$AutoModelForSequenceClassification$from_pretrained('facebook/bart-large-mnli')

# Input text
txt <- "I will see the world one day."

# Is this text related to cooking?
input <- tokenizer$encode(text = txt,
                          text_pair = 'cooking',
                          return_tensors = 'pt',
                          truncation = TRUE)

output <- model(input)

scores <- output$logits$detach()$tolist()[[1]]

prob <- exp(scores)/sum(exp(scores))

prob
[1] 0.906399 0.087287 0.006314

# the first element is FALSE, not related, prob = 0.906
# the second element is Neutral, prob = 0.087
# the third element is TRUE, related, prob = 0.006
# Is this text related to travel?
input <- tokenizer$encode(text = txt,
                          text_pair = 'travel',
                          return_tensors = 'pt',
                          truncation = TRUE)

output <- model(input)

scores <- output$logits$detach()$tolist()[[1]]

prob <- exp(scores)/sum(exp(scores))

prob
[1] 0.001291 0.228867 0.769842

# the first element is FALSE, not related, prob = 0.001
# the second element is Neutral, prob = 0.229
# the third element is TRUE, related, prob = 0.770
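Since the same encode-predict-softmax steps repeat for every candidate label, a small wrapper makes it easy to score several labels at once (a sketch using only the tokenizer and model objects defined above):

# Probability that txt is related to a candidate label (the entailment class)
zero_shot <- function(label, txt){
  input  <- tokenizer$encode(text = txt,
                             text_pair = label,
                             return_tensors = 'pt',
                             truncation = TRUE)
  scores <- model(input)$logits$detach()$tolist()[[1]]
  probs  <- exp(scores)/sum(exp(scores))
  probs[3]
}

sapply(c('cooking', 'travel', 'exploration'), zero_shot, txt = txt)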
Resources:
The purpose of this task is to compare two texts and produce a similarity score.
For instance, consider the following three sentences:
Today is a sunny day.
That is a happy person.
Weather is really nice.
We can say that Sentences 1 and 3 are more similar to each other, as they are both related to weather.
Let’s compute the similarity scores for these three sentences.
transformers <- import('transformers')
torch        <- import('torch')

txt1 <- 'Today is a sunny day'
txt2 <- 'That is a happy person'
txt3 <- 'Weather is really nice'

tokenizer <- transformers$AutoTokenizer$from_pretrained('sentence-transformers/multi-qa-MiniLM-L6-cos-v1')
model     <- transformers$AutoModel$from_pretrained('sentence-transformers/multi-qa-MiniLM-L6-cos-v1')

# Create a function to compute sentence embeddings for a given text.
# This function takes a text and generates a numeric representation of it in
# 384 dimensions. The input is plain text, and the output is the sentence
# embedding as a vector of 384 numbers.
encode_ <- function(txt){

  input      <- tokenizer(txt, padding = TRUE, truncation = TRUE, return_tensors = 'pt')
  output     <- model(input['input_ids'], return_dict = TRUE)
  embeddings <- output$last_hidden_state$detach()

  input_mask_expanded <- input['attention_mask']$unsqueeze(as.integer(-1))$expand(embeddings$size())$float()

  num <- torch$sum(torch$multiply(embeddings, input_mask_expanded), as.integer(1))
  den <- torch$clamp(input_mask_expanded$sum(as.integer(1)), min = 1e-9)

  emb <- torch$nn$functional$normalize(torch$divide(num, den),
                                       p   = as.integer(2),
                                       dim = as.integer(1))
  emb
}

# Embeddings for the texts
emb1 <- encode_(txt1)
emb1$shape
torch.Size([1, 384])

emb2 <- encode_(txt2)
emb2$shape
torch.Size([1, 384])

emb3 <- encode_(txt3)
emb3$shape
torch.Size([1, 384])

# Compute the similarity between Sentence 1 and Sentence 2
torch$mm(emb1,
         emb2$transpose(0L, 1L))$tolist()[[1]]
[1] 0.2588

# Compute the similarity between Sentence 1 and Sentence 3
torch$mm(emb1,
         emb3$transpose(0L, 1L))$tolist()[[1]]
[1] 0.5501
The model-predicted similarity score between Sentence 1 and Sentence 2 is 0.259, while the score between Sentence 1 and Sentence 3 is 0.550. So, the model predicts that Sentences 1 and 3 are more similar to each other.
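A compact way to get all pairwise similarities at once is to stack the three embeddings into a matrix and take its cross-product; since encode_() returns L2-normalized vectors, the dot products are cosine similarities (a sketch in base R):

# Stack the embeddings into a 3 x 384 matrix; E %*% t(E) then holds
# all pairwise cosine similarities
E <- rbind(as.numeric(emb1$numpy()),
           as.numeric(emb2$numpy()),
           as.numeric(emb3$numpy()))
round(E %*% t(E), 3)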
Resources:
The Speech Recognition task takes an audio file and transcribes the audio to text. To produce an example of this task, I will use this audio file; you will need to download it to your local drive.
The following code generates the transcription of this audio file.
# Load the Python modules
torch        <- import('torch')
librosa      <- import('librosa')
requests     <- import('requests')
transformers <- import('transformers')

# Load the Wav2Vec2 tokenizer and the model
tokenizer <- transformers$Wav2Vec2Tokenizer$from_pretrained('facebook/wav2vec2-base-960h')
model     <- transformers$Wav2Vec2ForCTC$from_pretrained('facebook/wav2vec2-base-960h')

# Load the audio file from the local drive
sound.file <- librosa$load(here('_posts/huggingface/Welcome.WAV'),
                           sr = 16000)

# Tokenize the input audio file
input_values <- tokenizer(sound.file[1],
                          return_tensors = "pt")

# Model prediction
logits <- model(input_values['input_values'])$logits

prediction <- torch$argmax(logits, dim = -1L)

# Decode the prediction
transcription <- tokenizer$batch_decode(prediction)[[1]]

transcription %>%
  as.matrix() %>%
  kbl() %>%
  kable_styling()

THANK YOU FOR CHOUSING THE OLYMPUS DICTATION MANAGEMENT SYSTEM THE OLYMPUS DICTATION MANAGEMENT SYSTEM GIVES YOU THE POWER TO MANAGE YOUR DICTATIONS TRANSCRIPTIONS AND DOCUMENTS SEEMLESSLY AND TO IMPROVE THE PRODUCTIVITY OF YOUR DAILY WORK FOR EXAMPLE YOU CAN AUTOMATICALLY SENT THE DICTATION FILES OR TRANSCRIBED DOCUMENTS TO YOUR ASSISTANT OR THE AUTHOR VIRE EMALE OR F T P IF YOURE USING THE SPEECH RECOGNITION SOFTWARE THE SPEECH RECOGNITION ENGINE WORKS IN THE BACKGROUND TO SUPPORT YOUR DOCUMENT CREATION WE HOPE YOU ENJOY THE SIMPLE FLEXIBLE RELIABLE AND SECURE SOLUTIONS FROM OLYMPUS
Resources:
Similar to Text Classification, you can use models to classify audio. Hubert-Large for Emotion Recognition classifies audio into four different emotions: happy, angry, sad, and neutral.
The code below uses the same audio file and predicts the emotion.
torch        <- import('torch')
librosa      <- import('librosa')
transformers <- import('transformers')

tokenizer <- transformers$Wav2Vec2FeatureExtractor$from_pretrained('superb/hubert-large-superb-er')
model     <- transformers$HubertForSequenceClassification$from_pretrained('superb/hubert-large-superb-er')

sound.file <- librosa$load(here('_posts/huggingface/Welcome.WAV'),
                           sr = 16000)

input_values <- tokenizer(sound.file[1],
                          sampling_rate = 16000,
                          padding = TRUE,
                          return_tensors = "pt")

logits <- model(input_values['input_values'])$logits
logits <- logits$detach()$tolist()[[1]]
logits
[1] -0.06883  0.21034  0.07505 -2.43493

probs <- exp(logits)/sum(exp(logits))

# Labels
data.frame(labels = unlist(model$config$id2label),
           probs  = probs)
labels probs
0 neu 0.28006
1 hap 0.37025
2 ang 0.32340
3 sad 0.02628
Resources:
The purpose of the Image Classification task is to predict a label for the objects in an image. For this task, I will use Google's Vision Transformer (ViT) model, the most popular model in this category. This model considers 1000 different class labels. Given an image file, the model predicts a probability for each of these labels.
For the demonstration, I will use the image at this link.
# Load the Python modules
transformers <- import('transformers')
pil          <- import('PIL')
requests     <- import('requests')
torch        <- import('torch')

# Read the image file from the url
url <- 'http://images.cocodataset.org/val2017/000000039769.jpg'

image <- pil$Image$open(requests$get(url, stream = TRUE)$raw)

# Load the feature extractor and model
feature_extractor <- transformers$ViTFeatureExtractor$from_pretrained('google/vit-base-patch16-224')
model             <- transformers$ViTForImageClassification$from_pretrained('google/vit-base-patch16-224')

# Extract the features from the given image
inputs <- feature_extractor(images = image, return_tensors = 'pt')
inputs['pixel_values']$shape
torch.Size([1, 3, 224, 224])

# Model predictions
outputs <- model(inputs['pixel_values'])
logits  <- outputs$logits

# Softmax transformation of the logits to probabilities
logits_ <- as.numeric(logits$detach()$numpy())

probs <- exp(logits_)/sum(exp(logits_))

labels <- as.matrix(unlist(model$config$id2label))

# Find the top 5 predicted categories
locs <- order(probs, decreasing = TRUE)[1:5]

data.frame(class = labels[locs],
           prob  = probs[locs])
                 class      prob
1         Egyptian cat 0.9374417
2     tabby, tabby cat 0.0384426
3            tiger cat 0.0144114
4      lynx, catamount 0.0032743
5 Siamese cat, Siamese 0.0006796
Resources:
The purpose of the Object Detection task is to identify different objects in an image and provide their locations in the image. For this task, I will use Facebook's DEtection TRansformer (DETR) model, the most popular model in this category.
This model considers 91 different class labels (the first being N/A). Given an image file, the model identifies whether any of these classes exist in the image and also provides the location of these objects in the given image.
For the demonstration, I will again use the same image. We can tell that there are two cats, two remotes, one blanket, and one couch in this image. Let's see if we can recover this information using the DETR model.
# Load the Python modules
transformers <- import('transformers')
pil          <- import('PIL')
requests     <- import('requests')
torchvision  <- import('torchvision')
timm         <- import('timm')

# Read the image
url <- 'http://images.cocodataset.org/val2017/000000039769.jpg'

image <- pil$Image$open(requests$get(url, stream = TRUE)$raw)

# Load the feature extractor and model
feat_ext <- transformers$DetrFeatureExtractor$from_pretrained('facebook/detr-resnet-50')
model    <- transformers$DetrForObjectDetection$from_pretrained('facebook/detr-resnet-50')

# Extract the features
inputs <- feat_ext(images = image,
                   return_tensors = 'pt')

# Model predictions
outputs <- model(inputs['pixel_values'])
logits  <- outputs$logits$detach()
bboxes  <- outputs$pred_boxes$detach()

logits$shape
torch.Size([1, 100, 92])
bboxes$shape
torch.Size([1, 100, 4])
This returns two important objects. This part was really the most frustrating, because I couldn't find any good documentation about what to do with these objects. Luckily, I found this discussion post, and the code included in the original question helped:
https://stackoverflow.com/questions/68350133/facebook-detr-resnet-50-in-huggingface-hub
I will try to explain as much as I understand. The first object includes the predicted logits for 92 classes in each of 100 rows. We first transform these logits into probabilities using a softmax transformation, and then we can format the result as a 100 x 92 matrix.
logit_mat <- matrix(nrow = 100, ncol = 92)

for(i in 0:99){
  logit_mat[i+1,] <- logits$softmax(-1L)[0][i]$tolist()
}
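For what it's worth, the same matrix can be produced without the loop, since reticulate converts a numpy array to an R matrix automatically (a sketch):

# Vectorized alternative: softmax over classes, drop the batch dimension,
# and convert the whole [100, 92] tensor to an R matrix in one step
logit_mat <- logits$softmax(-1L)[0]$numpy()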
As far as I can tell, the 100 rows correspond to DETR's 100 object queries (the model's candidate detections), and Column 92 is an extra "no object" class, which is why it is discarded. This leaves the logits for the 91 class labels the model considers.

logit_mat <- logit_mat[,-92]
Finally, we search row by row and check whether any row has a probability higher than a threshold. If we find such a row, we flag it and identify the corresponding label for the column whose probability exceeds the threshold.
threshold <- 0.7
labels    <- unlist(model$config$id2label)

class <- c()
prob  <- c()
id    <- c()

for(i in 1:100){

  loc <- which(logit_mat[i,] > threshold)

  if(length(loc) != 0){

    cl <- labels[as.numeric(names(labels)) == (loc-1)]

    class <- c(class, cl)
    prob  <- c(prob, logit_mat[i, loc])
    id    <- c(id, i)
  }
}

data.frame(id    = id,
           class = class,
           prob  = prob)
id class prob
1 38 remote 0.9982
2 58 remote 0.9960
3 60 couch 0.9955
4 62 cat 0.9988
5 99 cat 0.9987
Superb! So, it seems the model correctly identified the cats, remotes, and couch.
What else? The model also returned bounding-box locations for these objects. Let's check how they look.
coord <- bboxes[0][id-1]$numpy()
coord
[,1] [,2] [,3] [,4]
[1,] 0.1685 0.1967 0.21154 0.09828
[2,] 0.5481 0.2711 0.05482 0.23982
[3,] 0.4998 0.4947 0.99961 0.98461
[4,] 0.2557 0.5448 0.46997 0.87265
[5,] 0.7701 0.4089 0.46089 0.71847
What are these numbers? The four columns are normalized coordinates for (X center, Y center, Width, Height). Also, I noticed that the model's coordinate system has a reversed Y-axis relative to the plotting coordinates.
So, these coordinates should be re-scaled and rearranged given the actual dimensions of the image. The code below draws a rectangular box around each identified object using these coordinates and labels them.
# Read the image
library(magick)
Linking to ImageMagick 6.9.12.3
Enabled features: cairo, freetype, fftw, ghostscript, heic, lcms, pango, raw, rsvg, webp
Disabled features: fontconfig, x11

pic <- image_read('http://images.cocodataset.org/val2017/000000039769.jpg')
image_info(pic)
# A tibble: 1 x 7
format width height colorspace matte filesize density
<chr> <int> <int> <chr> <lgl> <int> <chr>
1 JPEG 640 480 sRGB FALSE 173131 72x72
# Set the actual width and height of the image
width  <- image_info(pic)$width
height <- image_info(pic)$height

# Rescale the coordinates according to the actual dimensions
coord[,1] <- coord[,1]*width
coord[,2] <- coord[,2]*height
coord[,3] <- coord[,3]*width
coord[,4] <- coord[,4]*height

# Reverse the Y scale
coord[,2] <- height - coord[,2]

# Finalized coordinates
# (X center, Y center, Width, Height)
coord
[,1] [,2] [,3] [,4]
[1,] 107.9 385.6 135.38 47.17
[2,] 350.8 349.9 35.09 115.11
[3,] 319.9 242.5 639.75 472.61
[4,] 163.6 218.5 300.78 418.87
[5,] 492.9 283.7 294.97 344.87
# Plot the picture in the plot object
plot(pic)

# For each identified object, draw a rectangular box around it and add a label
for(i in 1:nrow(coord)){
  rect(xleft   = coord[i,1] - coord[i,3]/2,
       ybottom = coord[i,2] - coord[i,4]/2,
       xright  = coord[i,1] + coord[i,3]/2,
       ytop    = coord[i,2] + coord[i,4]/2,
       border  = 'blue',
       lwd     = 4)
  text(coord[i,1], coord[i,2], class[i])
}
Resources:
For attribution, please cite this work as
Zopluoglu (2022, Jan. 30). Cengiz Zopluoglu: R, Reticulate, and Hugging Face Models. Retrieved from https://github.com/czopluoglu/website/tree/master/docs/posts/huggingface/
BibTeX citation
@misc{zopluoglu2022r,
  author = {Zopluoglu, Cengiz},
  title = {Cengiz Zopluoglu: R, Reticulate, and Hugging Face Models},
  url = {https://github.com/czopluoglu/website/tree/master/docs/posts/huggingface/},
  year = {2022}
}