Warning: this content is older than 365 days. It may be out of date and no longer relevant.

You Ask, I Answer: Scaling Content Curation?

Jen asks, “How do you curate content for your social media channels?”

I have a list of trusted sources, about 150 of them, that I bring into a SQL database. I run several scripts to vacuum up the article text, index it, and then run natural language processing to identify the contents. After that, I score each article against a pre-defined list of topics that I care about. A final script scans the table for a certain number of articles, by score, and exports the data into a format for bulk scheduling. Watch the video for a tour.

Machine-Generated Transcript

What follows is an AI-generated transcript. The transcript may contain errors and is not a substitute for watching the video.

In today’s episode, Jen asks, “How do you curate content for your social media channels?” That’s a complicated question to answer.

A few years ago, I was curating content, you know, reading through Feedly, and blogs and stuff like that.

And I noticed I kept doing the same thing over and over again, which took a really long time, a couple of hours a week.

And I said, this is silly.

This is something that can clearly be automated.

One of the benchmarks for automation is this: if you do the same thing over and over again, and you do it more than twice, it’s time to start thinking about automating it.

When you are doing the same thing every single day, it is definitely time to start automating.

So the process for automating this thing is relatively straightforward.

It’s just a lot of custom code.

Now there are paid services that do this kind of automation and they are all reassuringly expensive.

The reason I don’t use paid services is twofold.

One, I’m cheap.

And two, more importantly, I don’t like the black box algorithms these services use to find and recommend content.

I don’t know what goes into the box.

And therefore I don’t know how it makes its decision.

Then when you ask vendors, they say it’s a proprietary algorithm.

I get that.

But I still want to know how it works.

So I ended up rolling my own.

Here is what I did, and how you can start thinking about doing this for yourself if you have the requisite technology skills.

First, you need a source of articles, of content.

So let me bring this up here on screen.

This is the Ahrefs SEO tool, one of my favorites.

Type in the topic of your choice, and you’ll see, of course, a large collection of articles on your topic.

What you want to do is not focus on the articles but focus on the sources.

Also, if you already subscribe to some great blogs in Feedly, or whatever service you use, export that list of blogs.

You will then need a scraper to go out, retrieve those pieces of content, and put them in some sort of storage mechanism.

I use a SQL database, which you can see here, that pulls in each article by URL from those RSS feeds.
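
As a rough illustration of that step, here is a minimal sketch that reads RSS items and stores them by URL. It assumes SQLite as the database and a hard-coded sample feed; the table name, column names, and `store_feed` function are hypothetical and are not taken from the system shown in the video.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real run would download each source's RSS XML.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0"><channel><title>Example Blog</title>
<item><title>Data-driven planning</title><link>https://example.com/a</link></item>
<item><title>Analytics basics</title><link>https://example.com/b</link></item>
</channel></rss>"""

def create_store(conn):
    # The url column is UNIQUE so re-reading a feed does not duplicate rows.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles ("
        "id INTEGER PRIMARY KEY, url TEXT UNIQUE, title TEXT, body TEXT)"
    )

def count_rows(conn):
    return conn.execute("SELECT COUNT(*) FROM articles").fetchone()[0]

def store_feed(conn, rss_xml):
    """Parse RSS 2.0 XML, insert one row per item, return how many rows were new."""
    before = count_rows(conn)
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        url = (item.findtext("link") or "").strip()
        title = (item.findtext("title") or "").strip()
        if not url:
            continue  # skip items without a link
        conn.execute(
            "INSERT OR IGNORE INTO articles (url, title) VALUES (?, ?)",
            (url, title),
        )
    conn.commit()
    return count_rows(conn) - before

conn = sqlite3.connect(":memory:")
create_store(conn)
first = store_feed(conn, SAMPLE_RSS)   # both items are new
second = store_feed(conn, SAMPLE_RSS)  # same feed again adds nothing
```

Keying on the URL means the scraper can run as often as you like without filling the table with duplicates.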

Now, the important thing here is that I don’t want just the article title.

And I don’t want just the URL, I want the full article itself.

So one of the things that the software I wrote does is vacuum up the actual content of the article itself.

And you can see here on screen that a number of these have the full text coming in.
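
One simple way to get article text out of a downloaded page is to strip the markup and keep the visible text. This is only a sketch using Python’s standard library; the `TextExtractor` class and sample HTML are made up for illustration, and a production scraper would also need to drop navigation, ads, and other page furniture.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects visible text, skipping script and style blocks."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())

def extract_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)

sample = (
    "<html><head><style>p{color:red}</style></head><body>"
    "<h1>Sea level rise</h1><p>Quantifying <b>risk</b> for cities.</p>"
    "<script>track()</script></body></html>"
)
text = extract_text(sample)
```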

That’s important because the next step in this process is to figure out: is this an article that I would want to share? Is the topic relevant?

So there’s a stage, another piece of software on this database server, that goes through and identifies whether this is something that I care about.

It also pulls in social shares, SEO data from Ahrefs, things like that.

In this case, we can see, there are a number of articles that are about data.

There’s one here about analytics and things, and you can spot check very quickly just by looking at the titles: is this a relevant article?

Here’s one: “data driven planning for city resilience, quantifying sea level rise.”

Okay, that sounds like something that I would share.
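
To make the scoring idea concrete, here is a minimal sketch of keyword-based topic scoring. The topics and weights are invented for the example, since the real list is not given, and the actual system may score articles quite differently.

```python
import re

# Hypothetical topics and weights; the real topic list is not shown.
TOPIC_WEIGHTS = {"data": 2.0, "analytics": 3.0, "marketing": 1.5}

def score_article(text, weights=TOPIC_WEIGHTS):
    """Sum weight * occurrences for each topic word found in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return sum(weight * counts.get(topic, 0) for topic, weight in weights.items())

on_topic = score_article("Data-driven planning: analytics and data for city resilience")
off_topic = score_article("Ten recipes for a quick weeknight dinner")
```

Because the score is computed from the full text rather than the title alone, an article can rank well even when its headline never mentions the topic.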

There are also flags in here for things that I don’t want.

Let’s see, where’s that column? There’s one called blacklist.

Essentially, if I sort this column here, I delete anything that’s below a certain point.

Articles about politics, especially certain politicians: I don’t want them, I don’t want to share them, so they automatically get blacklisted and knocked out.

They never see the light of day.
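
A blacklist check can be as simple as matching blocked terms against the article text. The terms below are placeholders, and this sketch uses a yes/no flag rather than the sortable score column visible on screen.

```python
import re

# Hypothetical blocked terms; the source only says politics is excluded.
BLACKLIST = ["election", "senator", "campaign rally"]

def is_blacklisted(text, terms=BLACKLIST):
    """True if any blocked term appears as a whole word or phrase."""
    lowered = text.lower()
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", lowered) for term in terms
    )

articles = [
    "Senator proposes new data privacy bill",
    "How to build an analytics dashboard",
    "Selection bias in survey data",  # contains "election" only inside another word
]
kept = [title for title in articles if not is_blacklisted(title)]
```

Matching on word boundaries keeps a term like "election" from knocking out an unrelated article about "selection" bias.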

The next step after that is to assign them social sharing links.

I have my own link shortener because I got tired of getting ads from the service I was using, and of handing over a lot of money a month for it.

So I have a link shortener connected there.
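
The core of a homemade link shortener is a mapping between a database row ID and a short code. Here is one common approach, base 62 encoding, as a sketch; it is an assumption about how such a tool could work, not a description of the shortener mentioned here, and the `example.com` domain is a placeholder.

```python
import string

# 62 characters: digits, then lowercase, then uppercase letters.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)

def encode_id(n):
    """Turn a non-negative row ID into a short base 62 code."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, remainder = divmod(n, BASE)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))

def decode_code(code):
    """Turn a short code back into the row ID so the redirect can be looked up."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n

short_url = "https://example.com/" + encode_id(12345)  # placeholder domain
```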

And all this database processing happens on the database itself, and that prepares essentially 15,000 to 20,000 articles a month for processing.

And this script, this system here, runs every 15 minutes or so.

So it’s not like once a month, because it does take time for the software to go out and do all this processing and scoring.

At the end, what you end up with is a scoring system, right? So at the very end, there is this resource here.

You can see these are the highest ranked articles based on those topics, not containing things I don’t want.

“What is social media management,” “Instagram revenue and usage statistics,” “41 best data science programs”: these are all things that are perfectly on target for the kind of stuff I share.
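
Pulling the highest ranked articles is a natural fit for a SQL query. This sketch assumes SQLite and a hypothetical `articles` table with `score` and `blacklisted` columns; the real schema is not shown in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (title TEXT, score REAL, blacklisted INTEGER)")
conn.executemany(
    "INSERT INTO articles (title, score, blacklisted) VALUES (?, ?, ?)",
    [
        ("41 best data science programs", 9.5, 0),
        ("What is social media management", 8.0, 0),
        ("Senator holds campaign rally", 12.0, 1),  # high score, but blacklisted
        ("Weeknight dinner recipes", 0.0, 0),       # allowed, but off topic
    ],
)

def top_articles(conn, limit, min_score=1.0):
    """Return the best-scoring, non-blacklisted articles, highest first."""
    rows = conn.execute(
        "SELECT title, score FROM articles "
        "WHERE blacklisted = 0 AND score >= ? "
        "ORDER BY score DESC, title ASC LIMIT ?",
        (min_score, limit),
    )
    return rows.fetchall()

best = top_articles(conn, limit=50)
```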

So now the last step is to use another piece of software that I wrote, which goes in, takes all these articles, and blends in a series of social shares of things that essentially are ads, right?

Things that I want to promote, like my newsletter, like the Trust Insights newsletter, whatever oddities I want to promote, and it mixes them in.

So from a content curation perspective, there are 25 ads and 50 articles, so about a two to one ratio of articles to ads.

And then there’s a thank you section as well, where I’m pulling in an additional 25 articles that are all things other people have written about TrustInsights.ai.

I want to make sure that we’re sharing the love, thanking people for covering the company, right? That’s an important thing to do.

This all gets sewn together, at the end of the process, into one single CSV file, and it looks kind of like this.

This then goes into Agorapulse, Buffer, Sprout Social, whatever system you want to use to share your content.
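
Here is a sketch of that blending step using the counts mentioned: 50 articles, 25 ads, and 25 thank-you posts. The interleaving pattern and the two CSV columns are assumptions; each scheduling tool has its own bulk upload format, so check its documentation for the columns it expects.

```python
import csv
import io
from itertools import zip_longest

def blend(articles, ads, thanks):
    """Interleave posts in a repeating article, article, ad, thank-you pattern."""
    pairs = [articles[i:i + 2] for i in range(0, len(articles), 2)]
    blended = []
    for pair, ad, thank in zip_longest(pairs, ads, thanks):
        blended.extend(pair or [])
        if ad is not None:
            blended.append(ad)
        if thank is not None:
            blended.append(thank)
    return blended

def to_csv(posts):
    """Write (text, url) rows to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["text", "url"])  # hypothetical column names
    writer.writerows(posts)
    return buf.getvalue()

# Placeholder posts standing in for the real database rows.
articles = [(f"Article {i}", f"https://example.com/a{i}") for i in range(50)]
ads = [(f"Ad {i}", f"https://example.com/ad{i}") for i in range(25)]
thanks = [(f"Thanks {i}", f"https://example.com/t{i}") for i in range(25)]

queue = blend(articles, ads, thanks)
csv_text = to_csv(queue)
```

Spreading the ads and thank-you posts evenly through the queue avoids publishing a long run of promotional posts back to back.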

This is all pre-built, and this is fresh content.

One of the restrictions on the system is that it has to be only content shared in the last seven days.

And the summary shows the steps: cleaning, loading the different social shares, topic scans, link shortening, content scan.

At the end of this process, right now there are 321 articles that I could be sharing with you that were published within the last seven days and are topically relevant, out of the 5,000 or so each week that are raw inputs.
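
The seven-day freshness rule is straightforward to express in code. In this sketch the dates are made up and the clock is fixed so the example is repeatable; a live system would use the current time.

```python
from datetime import datetime, timedelta, timezone

def is_fresh(published, now, max_age_days=7):
    """True if the article was published within the last max_age_days days."""
    age = now - published
    return timedelta(0) <= age <= timedelta(days=max_age_days)

# Fixed "now" so the example gives the same result every time it runs.
now = datetime(2020, 1, 15, 12, 0, tzinfo=timezone.utc)
articles = [
    ("Fresh post", datetime(2020, 1, 12, 9, 0, tzinfo=timezone.utc)),
    ("Stale post", datetime(2020, 1, 1, 9, 0, tzinfo=timezone.utc)),
    ("Post dated in the future", datetime(2020, 1, 20, 9, 0, tzinfo=timezone.utc)),
]
fresh_titles = [title for title, published in articles if is_fresh(published, now)]
```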

Now this system is very technology heavy and you can see the user interface kind of sucks.

Actually, no, it does suck if you’re used to having a really nice polished interface.

This is not something that is going to do any good there, which is one of the reasons why it’s not for sale, right? It is not a product that you could just buy off the shelf and run on your own servers.

Trust Insights does offer it as a service.

If you want our help to get a file, we have a couple of paying clients who get weekly files from us.

But for those folks, you know, we work with them to tune their topics and tune their stuff so that they have input into the algorithm, but ultimately, they’re not maintaining the algorithm or the infrastructure.

Like I said, the reasons I do this are twofold.

One is so I know how the articles are getting chosen.

And when something comes up where I think, I don’t like that kind of article, I don’t want that kind of content in my social shares, I can go into the system itself and write exceptions, write rules, or change the code around to say, this is not something I want anymore.

Now, there is some, but not a ton of, machine learning in this.

One of my goals for 2020 is to upgrade the article selection process from manual tagging to supervised learning as a way to process the articles and get even better targeting.

But that’s going to require a lot of work, so it’s one of those things that probably gets done when things get slow.

But that’s how I do the content curation.

This process and the system have taken probably four years to write and tune over time, and there are constantly new changes coming in as new clients come on who want this service, or as I see things and learn things that I want to improve on it.

That changes the system too; it’s ongoing.

If I were going to start over from scratch, I’d probably do some of the back end architecture a lot differently.

It was built with my skills at that time, and as my skills evolve, the system evolves, but it’s still not where it could be yet, where it needs to go.

To build something like this yourself, you need SQL database skills.

You need a scripting language that is web compatible, like PHP, Python, etc.

And you need data processing language skills, like R or Python, in order to be able to create the scripts that you need and sew them all together into one system.

So those are the three sets of skills you’ll need to implement a system like this.

I would strongly recommend that you come up with your own algorithms.

You may also want a user interface if you’re going to do this yourself; I don’t particularly need one.

But that’s how it works.

That’s how the system works.

It’s been a pet project for years and it continues to grow.

And I hope you found this useful for thinking about how you could build your own system like this.

As always, please subscribe to the YouTube channel and the newsletter. I’ll talk to you soon.

Want help solving your company’s data analytics and digital marketing problems?

Visit TrustInsights.ai today and let us know how we can help you.


Want to read more like this from Christopher Penn? Get updates here:

subscribe to my newsletter here

