Arts and Benchmark - Nonprofit Technology

How Grok 3 compares to ChatGPT, DeepSeek and other AI rivals

Mashable Tech

FEBRUARY 19, 2025

xAI is promoting Grok 3 as the best model on the market, claiming it surpassed competitors from OpenAI, Google, Anthropic, and DeepSeek on key benchmarks. Shortly after the benchmarks were shared on the livestream, OpenAI product engineer Rex Asabor posted an "updated" chart with o3 beating Grok 3 Reasoning in math and science benchmarks.

Flash

Flash Benchmark Model Law

Benchmarking: Networked Nonprofits Measure Their Social Media Results In A Context

Beth's Blog: How Nonprofits Can Use Social Media

MAY 2, 2011

Arts and Social Media. At Zoetica, we are facilitating a social media peer learning project called “Leveraging Social Media: Becoming A Networked Nonprofit.” Devon Smith, who writes the 24 Usable Hours blog and is a self-described “data nerd,” did a benchmarking analysis for participants. Benchmarking Study by Devon Smith.

Social Media

Social Media Benchmark Measure Results

Social Media and the Arts: How Strong Is Your Social Net?

Beth's Blog: How Nonprofits Can Use Social Media

DECEMBER 6, 2013

Note From Beth: Back in 2011, I had the pleasure of facilitating a panel discussion at the Grantmakers in the Arts pre-conference on technology and media with Rory MacPherson and Jai Sen from Sen Associates, where I learned about a research study they were conducting about social media use in the arts. keep spending level.

Social Media

Social Media Arts Social Media

Universal Speech Model (USM): State-of-the-art speech AI for 100+ languages

Google Research AI blog

MARCH 6, 2023

USM is a family of state-of-the-art speech models with 2B parameters trained on 12 million hours of speech and 28 billion sentences of text, spanning 300+ languages. For en-US, USM has a 6% relative lower WER compared to the current internal state-of-the-art model. USM, which is for use in YouTube (e.g., Lower WER is better.

Language

Language Arts Model University

Data-centric ML benchmarking: Announcing DataPerf’s 2023 challenges

Google Research AI blog

MARCH 30, 2023

Even many of the standard datasets we use today have been shown to have mislabeled data that can destabilize established ML benchmarks. In this blogpost, we outline dataset development bottlenecks confronting researchers and discuss the role of benchmarks and leaderboards in incentivizing researchers to address these challenges.

Benchmark

Benchmark Challenge Data Training

The most innovative companies in artificial intelligence for 2025

Fast Company Tech

MARCH 18, 2025

The o1 model rose quickly to the top of the rankings in common benchmark tests, and soon Google DeepMind, Anthropic, DeepSeek and others were training their models for real-time reasoning. Even before the appearance of new reasoning models, some of AI’s hottest companies produced state-of-the-art new AI systems.

Companies

Companies Model Training Train

Introducing MASK: A Benchmark for Measuring Honesty in AI Systems

The AI Alignment Forum

MARCH 5, 2025

Published on March 5, 2025 10:56 PM GMT. In collaboration with Scale AI, we are releasing MASK (Model Alignment between Statements and Knowledge), a benchmark with over 1000 scenarios specifically designed to measure AI honesty. [1] Many state-of-the-art models lie under pressure. Interventions: Can We Make AI More Honest?

Benchmark

Benchmark Measure System Evaluation

Little Giants, Big Money: Lessons in Social Media Fundraising From a Liberal Arts College

NonProfit Hub

JUNE 18, 2014

Wabash is small (901 students last year), all male (one of three such institutions left in the country), a traditional liberal arts school and the best decision I’ve ever made. Throughout the day as benchmarks and goals were met, Wabash set and announced new ones. Each group pledged $43,000 for their initial benchmarks.

Social Media

Social Media Arts Lesson Media

Startup layoffs, the art of reinvention and a MasterClass in change

TechCrunch

JUNE 25, 2022

Instead, I think that changes within a particular startup can be used as benchmark questions for their larger market; in other words, we can use the micro to better understand the macro. With that in mind, I want to talk about MasterClass’ decision to lay off 20% of its staff, around 120 people, across all teams.

Arts

Arts Change Lesson Celebrate

5 Online Communication Styles for Nonprofits

Nonprofit Tech for Good

NOVEMBER 30, 2014

If you are an arts and culture organization, think about crafting a tone of voice that is creative, clever, and entertaining. For many years the dominant benchmark for whether a nonprofit is successfully using mobile and social media has been if it engages or not, but engagement for the sake of engagement is a flawed communication method.

Communication

Communication Online Nonprofit Social Media

Microsoft’s new image-captioning AI will help accessibility in Word, Outlook, and beyond

The Verge

OCTOBER 14, 2020

The algorithm, which was described in a pre-print paper published in September, achieved the highest ever scores on an image-captioning benchmark known as “nocaps.” The nocaps benchmark consists of more than 166,000 human-generated captions describing some 15,100 images taken from the Open Images Dataset. (You

Images

Images Accessibility Benchmark PowerPoint

Tidbits

Zen and the Art of Nonprofit Technology

MAY 19, 2008

There are some really interesting tidbits of stuff out there. It’s titled “Benchmarking With a Warped Stick.” It takes aim at Convio’s recent benchmarking study.

Convio

Convio Kintera Benchmark Open Source

5 Online Communication Styles for Nonprofits

Nonprofit Tech for Good

FEBRUARY 23, 2017

If you are an arts and culture organization, think about crafting a tone of voice that is creative, clever, and entertaining. If your nonprofit focuses on human rights or poverty, for example, then your tone of voice should be serious, smart, and thought-provoking. Engagement.

Communication

Communication Online Nonprofit Social Media

Why HR’s future looks like marketing’s past

Fast Company Tech

MARCH 18, 2025

In 2009, when I worked at Gap’s newly formed digital division, the finance team set benchmarks for success in e-commerce. Luckily, smart business leaders now realize that it takes a mix of art and science to get it right. And the results mattered more than ever. Along with the pressure, that limelight also brought opportunity.

ROI

ROI Channel Measure Results

AVFormer: Injecting vision into frozen speech models for zero-shot AV-ASR

Google Research AI blog

JUNE 2, 2023

The resulting AVFormer model achieves state-of-the-art zero-shot performance on three different AV-ASR benchmarks (How2, VisSpeech and Ego4D), while also crucially preserving decent performance on traditional audio-only speech recognition benchmarks (i.e., LibriSpeech). Unconstrained audiovisual speech recognition.

Model

Model Audio Avatar Phase

These 3 Types Of Nonprofits Spend The Most On Advertising According To This 2021 Study

Kindful

JANUARY 5, 2022

The Nonprofit Advertising Benchmark Study is a report from Whole Whale, a B Corp digital agency that works with nonprofit and social impact organizations. Despite this limitation, the data included in this study can provide benchmarks that might be useful in informing how much your organization would like to spend on advertising.

Studies

Studies Nonprofit Benchmark Arts

The 10 most innovative computing companies of 2025

Fast Company Tech

MARCH 18, 2025

With 1-Click Clusters, users of any size can get on-demand access to state-of-the-art Nvidia H100 Tensor Core GPUs and GH200 Superchips in a public cloud, enabling large-scale model training without having to lock in long-term contracts. In a May publication in Science Advances, researchers from Quantinuum, the U.S.

Companies

Companies Application System Model

25 SMART Social Media Objectives

Beth's Blog: How Nonprofits Can Use Social Media

JUNE 1, 2011

At Zoetica, we’ve been working on a peer learning project with arts organizations called “Leveraging Social Media” based on the social media lab. It also helps to break down your goal into monthly or quarterly benchmarks. SMART objectives can be revised along the way.

Social Media

Social Media Media Social Facebook

Fans are upset with Crysis Remaster’s graphics, so Crytek is delaying the game

The Verge

JULY 1, 2020

You can decide for yourself if the game looks too similar to a 10+ year-old game: Crysis, a sci-fi first-person shooter series originally released on PC, PS3, and Xbox 360, was praised for its graphical design and remained a benchmark for people looking to test the power of their gaming PC builds.

Game

Game Meme Benchmark Test

Don’t Set and Forget Technology–Regular Assessments Deliver Peak Performance

.orgSource

JUNE 12, 2023

Unfortunately, the head-spinning pace of innovations means that state-of-the-art becomes obsolete in a New York minute. But even if your IT systems are crushing their benchmarks, there are additional significant reasons to evaluate your digital status. If only we could take that casual approach to technology in general.

Technology

Technology Evaluation Associations System

January 2014 Nonprofit Blog Carnival: Measurement and Learning

Beth's Blog: How Nonprofits Can Use Social Media

JANUARY 29, 2014

As KD Paine and I wrote in “Measuring the Networked Nonprofit,” measuring your social media channels, overall communications or marketing strategy is not a form of voodoo black magic; it is an art and a science. The only way to do that is to benchmark against your organization’s performance or peer organizations.

Measure

Measure Learning Case Study Social Media

Technology in the Arts Conference: Funding Technology Projects

Beth's Blog: How Nonprofits Can Use Social Media

OCTOBER 20, 2006

Find peers who are in a similar situation; they don't have to be at another arts organization. Jeff Forster mentioned his survey of nonprofit technology benchmarks, with comparable data from the last six years; the next segment comes out next week! He notes, "It isn't possible for one person to keep up. You need your peers."

Technology

Technology Arts Project Literacy

Benchmarking Data You Can Use: Live from #BBCON

Connection Cafe

OCTOBER 1, 2013

It’s easy to host a benchmarking session and just present data. Why benchmark? Jay Odell, Vice President of Altru at Blackbaud, discussed why benchmarking data is important. Most organizations aren’t lacking in opinions on where to focus, but benchmarking data tells you where you need to focus. Are you using benchmarks?

Benchmark

Benchmark Data Arts Blackbaud

10 Digital Marketing & Fundraising Trends for Nonprofits in 2022

Nonprofit Tech for Good

DECEMBER 8, 2021

1) Master the art of Plain Language. M+R Benchmarks Report. You will then be sent an invoice for $20 that can be paid with a credit card. Once paid, you will be given access to the recording. Thank you and Happy New Year! Plain language is communication your audience can understand the first time they read or hear it.

Trend

Trend Fundraising Digital Marketing

Minecraft with RTX ray tracing launches for Windows 10

The Verge

DECEMBER 8, 2020

Minecraft traditionally shirks realism in favor of a pixel art, sandbox fantasy aesthetic, but the pairing works well here. All of these benchmarks were set under Nvidia’s specific testing conditions which can be viewed on their blog.

Reflection

Reflection Map Sample Instructional

Vid2Seq: a pretrained visual language model for describing multi-event videos

Google Research AI blog

MARCH 17, 2023

The resulting Vid2Seq model pre-trained on millions of narrated videos improves the state of the art on a variety of dense video captioning benchmarks including YouCook2, ViTT and ActivityNet Captions. predicting the next token given previous ground-truth tokens). Learn more from the paper and grab the code here.

Language

Language Video Model Benchmark

Guest Post by Marc van Bree: Orchestras and Social Media Survey

Beth's Blog: How Nonprofits Can Use Social Media

DECEMBER 14, 2009

While there have been different surveys on nonprofit adoption, for example, these two recent studies I profiled last month, I wish there was a benchmarking study. Tags: Art Sector Arts & Technology. What are the valued metrics? I mentioned this on Twitter. I invited him to write a guest post summarizing the findings.

Social Media

Social Media Survey Media Social

Advances in document understanding

Google Research AI blog

AUGUST 9, 2023

Consequently, academic benchmarks report strong model accuracy, but these same models do poorly when used for complex real-world applications. We list five requirements for a good document understanding benchmark, based on the kinds of real-world documents for which document understanding models are frequently used.

Template

Template Benchmark Evaluation Sample

Hot Off The Presses: Convio Online Marketing Nonprofit Benchmark Index™ Study

Connection Cafe

MARCH 14, 2011

Downtime may be hard to come by with SXSWi, NTC and AFP International activity this week, but just in case you have a few minutes to burn waiting for a flight, train or a panel to start, I’m arming you with some exhilarating reading material: the Convio Online Marketing Nonprofit Benchmark Index Study.

Convio

Convio Benchmark Studies Haiti

Tidbits

Zen and the Art of Nonprofit Technology

APRIL 17, 2008

I guess because I’m a blogger, I get these interesting tidbits in my mailbox. They’ve been doing some nonprofit research.

Open Source

Open Source Convio RSS Benchmark

Meta eyes LLM dominance with new Llama 3 models

InfoWorld

APRIL 19, 2024

This next generation of Llama demonstrates state-of-the-art performance on a wide range of industry benchmarks and offers new capabilities, including improved reasoning.

Model

Model Open Source Benchmark Classes

AI is coming for the laptop class

Recode by Vox

MARCH 13, 2025

Humanoid robots capable of tasks like folding laundry have been a longtime dream, but the state-of-the-art falls wildly short of human level. At the very least, an AI remote worker will have to use a computer fluently, and perhaps surprisingly, the best benchmarks we have, like OSWorld, do not show AI models doing that.

Laptop

Laptop Classes Job Model

Unicef’s Little Bet on Pinboard

Beth's Blog: How Nonprofits Can Use Social Media

NOVEMBER 28, 2012

Ami’s one and only board is called “ Really want these ” and instead of Louboutins, iPhone cases and nail art she’s dying to try, Ami’s pinned images include plain rice, faucets for clean drinking water, and chalk for school. It links to a donation landing page. Please tell us if you know of one! What does it look like? What did you learn?

Sierra Leone

Sierra Leone Case Study Benchmark Studies

Recent advances in deep long-horizon forecasting

Google Research AI blog

APRIL 20, 2023

A number of neural network–based solutions have been able to show good performance on benchmarks and also support the above criterion. However, other work has suggested that even linear models can outperform these transformer variants on time-series benchmarks. Left: MSE on the test set of a popular traffic forecasting benchmark.

Benchmark

Benchmark Metrics Training Train

DeepMind tests the limits of large AI language systems with 280-billion-parameter model

The Verge

DECEMBER 8, 2021

DeepMind’s research confirms this trend and suggests that scaling up LLMs does offer improved performance on the most common benchmarks testing things like sentiment analysis and summarization. To come to these conclusions, DeepMind’s researchers evaluated a range of different-sized language models on 152 language tasks or benchmarks.

Language

Language Model Test System

F-VLM: Open-vocabulary object detection upon frozen vision and language models

Google Research AI blog

MAY 12, 2023

Evaluation: We apply F-VLM to the popular LVIS open-vocabulary detection benchmark. average precision (AP) on rare categories (APr), which outperforms the state of the art by 6.5 F-VLM outperforms the state of the art (SOTA) on LVIS open-vocabulary detection benchmark and transfer object detection.

Language

Language Model Open Training

Foundation models for reasoning on charts

Google Research AI blog

MAY 26, 2023

With these methods we surpass the previous state of the art in ChartQA by more than 20% and match the best summarization systems that have 1000 times more parameters. MatCha surpasses previous models’ performance by a large margin and also outperforms the previous state of the art, which assumes access to underlying tables.

Chart

Chart Model Foundation Language

This Nvidia RTX 4090 video wins 2021’s best April Fools’ joke

The Verge

APRIL 1, 2021

The video is a work of art, with subtle details like two power cords, RGB lighting, or the ridiculous GPU benchmarking tool that records more than 23,000 frames per second.

Video

Video Benchmark Arts Fun

Personal Stories – Arts Orgs Need Not Apply?

Connection Cafe

APRIL 21, 2014

Working with arts organizations, there is often concern that your constituent stories aren’t as impactful. million likes surely someone touting the effect of music and art on their lives can get just as many. Annual Fund Fundraising Arts & Cultural museum' to participate in an event. If a picture of an angry cat can get 4.5

Arts

Arts Story Org Personal

BeGreatTV to offer MasterClass-like courses taught by Black and brown innovators

TechCrunch

FEBRUARY 5, 2021

When BeGreatTV launches in a couple of months (the plan is to launch in April), the platform will feature at least 10 courses, each with around 15 episodes, focused on arts, entertainment, beauty and more. The company’s $180 annual subscription fee accounts for all of its revenue.

Course

Course Teach Artist Celebrate

Unsupervised and semi-supervised anomaly detection with data-centric ML

Google Research AI blog

FEBRUARY 8, 2023

Using data-centric approaches, we show state-of-the-art results in both. average precision (AP) with a 10% anomaly ratio compared to a state-of-the-art one-class deep model on CIFAR-10. Lastly, on Thyroid (tabular data), SRR outperforms a state-of-the-art one-class classifier by 22.9 We consider methods with both shallow (e.g.,

Data

Data Ratio Sample Training

Bursting at the seams: Candid’s largest nonprofit compensation report yet

Candid

SEPTEMBER 14, 2023

Additionally, compensation in fields like arts and culture and mental health decreased or remained stagnant, while compensation in public safety and health grew the fastest in 2021. Location also remains a key factor when it comes to median executive compensation. These are just a few highlights from a report rich with key data and details.

Report

Report District of Columbia Nonprofit Data

AirTree backs generative AI content creation and management platform Narrato

TechCrunch

APRIL 3, 2023

Solanki explained that for both AI and non-AI content creation, users choose from templates, including blogs, articles, web copy, emails, video scripts, social media content and art. They also include research and benchmarking to help content creators reach a wider audience.

Content

Content Generation Platform Management

Online Fundraising Best Practices

Care2

JULY 16, 2010

2008 donorCentrics Internet Giving Collaborative Benchmarking Analysis); * The average online gift was $144.72, according to Blackbaud, though M&R’s benchmark study noted that the average one-time online gift was $81. How Does Your Nonprofit Measure Up?

Online

Online Fundraising Practice Blackbaud
