Report for Sigma/financial-sentiment-analysis

#78
by giskard-bot - opened

Hey Team!🤗✨
We’re thrilled to share some amazing evaluation results that’ll make your day!🎉📊

We have identified 3 potential vulnerabilities in your model based on an automated scan.

This automated analysis evaluated the model on the dataset financial_phrasebank (subset sentences_allagree, split train).

👉Robustness issues (3)

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Robustness | major 🔴 | | Fail rate = 0.409 | Transform to title case | 409/1000 tested samples (40.9%) changed prediction after perturbation |

🔍✨Examples
When feature “text” is perturbed with the transformation “Transform to title case”, the model changes its prediction in 40.9% of the cases. We expected the predictions not to be affected by this transformation.

| | text | Transform to title case(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 408 | Ragutis , which is based in Lithuania 's second-largest city Kaunas , boosted its sales last year 22.3 per cent to 36.4 million liters . | Ragutis , Which Is Based In Lithuania 'S Second-Largest City Kaunas , Boosted Its Sales Last Year 22.3 Per Cent To 36.4 Million Liters . | LABEL_2 (p = 1.00) | LABEL_1 (p = 1.00) |
| 117 | Finnish department store chain Stockmann Oyj Abp net profit rose to 39.8 mln euro ( $ 56.8 mln ) for the first nine months of 2007 from 37.4 mln euro ( $ 53.4 mln ) for the same period of 2006 . | Finnish Department Store Chain Stockmann Oyj Abp Net Profit Rose To 39.8 Mln Euro ( $ 56.8 Mln ) For The First Nine Months Of 2007 From 37.4 Mln Euro ( $ 53.4 Mln ) For The Same Period Of 2006 . | LABEL_2 (p = 1.00) | LABEL_1 (p = 1.00) |
| 2155 | Net sales decreased to EUR 91.6 mn from EUR 109mn in the corresponding period in 2005 . | Net Sales Decreased To Eur 91.6 Mn From Eur 109Mn In The Corresponding Period In 2005 . | LABEL_0 (p = 1.00) | LABEL_1 (p = 1.00) |

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Robustness | major 🔴 | | Fail rate = 0.402 | Transform to uppercase | 402/1000 tested samples (40.2%) changed prediction after perturbation |

🔍✨Examples
When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 40.2% of the cases. We expected the predictions not to be affected by this transformation.

| | text | Transform to uppercase(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 118 | Finnish lifting equipment maker Konecranes Oyj said on July 30 , 2008 that its net profit rose to 71.2 mln euro ( $ 111.1 mln ) for the first half of 2008 from 57.1 mln euro ( $ 89.1 mln ) for the same period of 2007 . | FINNISH LIFTING EQUIPMENT MAKER KONECRANES OYJ SAID ON JULY 30 , 2008 THAT ITS NET PROFIT ROSE TO 71.2 MLN EURO ( $ 111.1 MLN ) FOR THE FIRST HALF OF 2008 FROM 57.1 MLN EURO ( $ 89.1 MLN ) FOR THE SAME PERIOD OF 2007 . | LABEL_2 (p = 1.00) | LABEL_1 (p = 1.00) |
| 875 | Net sales will , however , increase from 2005 . | NET SALES WILL , HOWEVER , INCREASE FROM 2005 . | LABEL_2 (p = 1.00) | LABEL_1 (p = 1.00) |
| 823 | In the end of 2006 , the number of outlets will rise to 60-70 . | IN THE END OF 2006 , THE NUMBER OF OUTLETS WILL RISE TO 60-70 . | LABEL_2 (p = 1.00) | LABEL_1 (p = 1.00) |

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Robustness | medium 🟡 | | Fail rate = 0.068 | Add typos | 68/1000 tested samples (6.8%) changed prediction after perturbation |

🔍✨Examples
When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 6.8% of the cases. We expected the predictions not to be affected by this transformation.

| | text | Add typos(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 1919 | Cash flow from operations totalled EUR 7.4 mn , compared to a negative EUR 68.6 mn in the second quarter of 2008 . | Cash dlow from operations totalled WUR 7.4 mhn , comlared to a negative EUR 68.6 mn in the second wquarter of 2008 . | LABEL_2 (p = 1.00) | LABEL_0 (p = 1.00) |
| 2217 | Talentum 's net sales in September were smaller than expected . | Talentum ' net sales in September were skaler tham expeted . | LABEL_0 (p = 1.00) | LABEL_1 (p = 1.00) |
| 1917 | Cash flow from business operations totalled EUR 0.4 mn compared to a negative EUR 15.5 mn in the first half of 2008 . | Cash flow friom business opwrztions totalked EUR 0.4 mn comparef to a negarive EUR 15.5 mn in the first half of 2008 . | LABEL_2 (p = 0.99) | LABEL_1 (p = 0.66) |
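
The fail rates in this report are the share of tested samples whose predicted label changes after a transformation (for example, 409/1000 = 0.409). The following is a minimal sketch of that computation, not Giskard's implementation. `toy_predict` is a hypothetical, deliberately case-sensitive stand-in for the real model, and the three sample sentences are only illustrative inputs.

```python
from typing import Callable, Iterable


def fail_rate(predict: Callable[[str], str],
              texts: Iterable[str],
              transform: Callable[[str], str]) -> float:
    """Share of samples whose predicted label changes after `transform`."""
    texts = list(texts)
    if not texts:
        raise ValueError("need at least one sample")
    changed = sum(predict(t) != predict(transform(t)) for t in texts)
    return changed / len(texts)


def toy_predict(text: str) -> str:
    """Hypothetical case-sensitive classifier, for illustration only."""
    if "rose" in text or "increase" in text:
        return "LABEL_2"
    if "decreased" in text:
        return "LABEL_0"
    return "LABEL_1"


samples = [
    "Net sales will , however , increase from 2005 .",
    "Net sales decreased to EUR 91.6 mn from EUR 109mn in the corresponding period in 2005 .",
    "The company is based in Helsinki .",
]

# Casing perturbations, mirroring the two transformations named in the report.
upper_rate = fail_rate(toy_predict, samples, str.upper)
title_rate = fail_rate(toy_predict, samples, str.title)

# Assumption: lowercasing inputs before prediction is acceptable for the model.
# For the toy model this removes the sensitivity to casing entirely.
normalized_rate = fail_rate(lambda t: toy_predict(t.lower()), samples, str.upper)

print(upper_rate, title_rate, normalized_rate)
```

Whether input lowercasing is a suitable mitigation for the real model depends on how it was trained and tokenized, which this report does not state; the sketch only shows how the metric responds to such a wrapper.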

Check out the Giskard Space and improve your model.

Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.

💡 What's Next?

  • The Giskard community is always buzzing with ideas. 🐢🤔 What do you want to see next? Your feedback is our favorite fuel, so drop your thoughts in the community forum! 🗣️💬 Together, we're building something extraordinary.

🙌 Big Thanks!

We're grateful to have you on this adventure with us. 🚀🌟 Here's to more breakthroughs, laughter, and code magic! 🥂✨ Keep hugging that code and spreading the love! 💻 #Giskard #Huggingface #AISafety 🌈👏 Your enthusiasm, feedback, and contributions are what we seek. 🌟 Keep being awesome!

Since this is a sentiment analysis model, we filtered the non-conform issues. The final scan report is as follows:

Hey Team!🤗✨
We’re thrilled to share some amazing evaluation results that’ll make your day!🎉📊

We have identified 1 potential vulnerability in your model based on an automated scan.

This automated analysis evaluated the model on the dataset financial_phrasebank (subset sentences_allagree, split train).

👉Robustness issues (1)

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Robustness | medium 🟡 | | Fail rate = 0.068 | Add typos | 68/1000 tested samples (6.8%) changed prediction after perturbation |

🔍✨Examples
When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 6.8% of the cases. We expected the predictions not to be affected by this transformation.

| | text | Add typos(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 1919 | Cash flow from operations totalled EUR 7.4 mn , compared to a negative EUR 68.6 mn in the second quarter of 2008 . | Cash dlow from operations totalled WUR 7.4 mhn , comlared to a negative EUR 68.6 mn in the second wquarter of 2008 . | LABEL_2 (p = 1.00) | LABEL_0 (p = 1.00) |
| 2217 | Talentum 's net sales in September were smaller than expected . | Talentum ' net sales in September were skaler tham expeted . | LABEL_0 (p = 1.00) | LABEL_1 (p = 1.00) |
| 1917 | Cash flow from business operations totalled EUR 0.4 mn compared to a negative EUR 15.5 mn in the first half of 2008 . | Cash flow friom business opwrztions totalked EUR 0.4 mn comparef to a negarive EUR 15.5 mn in the first half of 2008 . | LABEL_2 (p = 0.99) | LABEL_1 (p = 0.66) |
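
To reproduce this kind of check locally, a typo perturbation can be approximated with a few character-level edits. The sketch below is an assumption about what such a transformation might look like, not the one Giskard uses: `add_typos`, its `rate` and `seed` parameters, and the three edit operations (drop, double, swap) are all hypothetical choices made for illustration. Only letters are edited, so digits and punctuation stay intact.

```python
import random


def add_typos(text: str, rate: float = 0.05, seed: int = 0) -> str:
    """Randomly drop, double, or swap letters; non-letters are never edited."""
    rng = random.Random(seed)  # seeded so the perturbation is reproducible
    chars = list(text)
    out = []
    i = 0
    while i < len(chars):
        c = chars[i]
        if c.isalpha() and rng.random() < rate:
            op = rng.choice(["drop", "double", "swap"])
            if op == "drop":
                i += 1
                continue
            if op == "double":
                out.extend([c, c])
                i += 1
                continue
            # Swap with the next character only when it is also a letter.
            if i + 1 < len(chars) and chars[i + 1].isalpha():
                out.extend([chars[i + 1], c])
                i += 2
                continue
        out.append(c)
        i += 1
    return "".join(out)


sentence = "Talentum 's net sales in September were smaller than expected ."
perturbed = add_typos(sentence, rate=0.1, seed=1)
print(perturbed)
```

Comparing the model's predicted label on `sentence` and on `perturbed` across many samples gives a fail rate comparable in spirit to the 6.8% reported here, although the exact figure will depend on the perturbation used.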
