What is the best way to offload data from a SharePoint library to somewhere else?

So what is the situation when there is no retention policy specified... at all?

93 days

Honestly, the defaults haven't failed me yet. Everything people need is in the bin; I just hit "Restore".
 
I have an Archive team. I move all the SharePoint stuff I need to keep out of the other sites into it, and it's configured on someone's desktop somewhere to sync all of it. I have to verify that machine has everything, move it all into some other kind of storage, then delete it from the team, go online, and empty the SharePoint recycle bin for that team.

This makes perfect sense... as long as you have the ability to make the call for this step:

move all the SharePoint stuff I need to keep out of the other sites into it (emphasis mine)

When trying to manage this for clients, only they can really know "the stuff they need out of there" - i.e., what data can be archived. A big issue here is that there aren't many built-in tools to help identify this data. You contact the client, tell them they are running out of space, and they ask you to "make a list of all files that haven't been accessed in 5 years". Maybe you can do that in PowerShell (a rough sketch follows), I don't know. Then they say, "OK, go ahead and archive that". Well, it's not that simple. Presumably they don't just want random old files dumped in a single archive folder on some storage service. You would need to maintain a similar folder tree on that service so they could actually find the data if some day they needed it.
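
Here's what that file listing might look like with the PnP.PowerShell module. This is a hedged sketch, not a tested procedure: the site URL and library name are placeholders, and since SharePoint doesn't readily expose a per-file "last accessed" date, the Modified date stands in for it.

```powershell
# Sketch: list files not modified in 5 years (site URL and library name are assumptions).
# "Not accessed" is approximated by the Modified date.
Connect-PnPOnline -Url "https://contoso.sharepoint.com/sites/Clients" -Interactive

$cutoff = (Get-Date).AddYears(-5)

Get-PnPListItem -List "Documents" -PageSize 500 |
    Where-Object { $_.FileSystemObjectType -eq "File" -and $_.FieldValues.Modified -lt $cutoff } |
    Select-Object @{ n = 'Path';     e = { $_.FieldValues.FileRef } },
                  @{ n = 'Modified'; e = { $_.FieldValues.Modified } } |
    Export-Csv -Path .\stale-files.csv -NoTypeInformation
```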

Maintaining that tree means you would need to create "X:\Clients\R\RDC001\Projects\2016" folders there to accept the data. I know I wouldn't want that job unless it could be automated, but can it? If it can't be automated, this is a project that will never be done consistently.
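
It probably can be automated, at least roughly. Continuing the sketch above (the archive root and path handling are assumptions for illustration), you could replay each file's server-relative path as a local folder tree while downloading, so the archive mirrors the library's structure:

```powershell
# Sketch: download the stale files from the CSV above, recreating the folder
# tree under an assumed local archive root. Requires the PnP session above.
$archiveRoot = "X:\Clients\R\RDC001\Archive"

Import-Csv .\stale-files.csv | ForEach-Object {
    # Path is server-relative, e.g. /sites/Clients/Documents/Projects/2016/plan.xlsx;
    # strip the site and library prefix to get the in-library path.
    $relative = $_.Path -replace '^/sites/[^/]+/[^/]+/', ''
    $parent   = Split-Path $relative -Parent
    $destDir  = if ($parent) { Join-Path $archiveRoot $parent } else { $archiveRoot }
    New-Item -ItemType Directory -Path $destDir -Force | Out-Null
    Get-PnPFile -Url $_.Path -Path $destDir -FileName (Split-Path $relative -Leaf) -AsFile
}
```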

Don't forget, too, that it's not just file sizes that are a problem. I can imagine many folks would run into the limitation on the NUMBER of files before they hit the space limit.

This entire issue is largely unknown to the average client, and it's difficult to get them to assign enough value to it to do anything before limits are hit and there are problems.
 

Really, about the number limit? It looks like a SharePoint document library has a limit of... 30 million files/folders? I mean, I know things can add up quickly, but... 30 million quickly?

I just emptied their recycle bin and then dumped all 93 days' worth of files out of the second-stage recycle bin. Any idea how long it takes SharePoint to recalculate available storage?

It looks like we are well within the 30-million-item limitation?
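
For anyone wanting to watch that number while waiting, here's a minimal sketch with the SharePoint Online Management Shell (the admin and site URLs are placeholders). StorageUsageCurrent is reported in MB, and it can lag well behind a recycle-bin purge:

```powershell
# Sketch: check a site's reported storage usage (values are in MB).
Connect-SPOService -Url "https://contoso-admin.sharepoint.com"

Get-SPOSite -Identity "https://contoso.sharepoint.com/sites/Clients" -Detailed |
    Select-Object Url, StorageUsageCurrent, StorageQuota
```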


 
I think I just found part of the issue. After emptying the recycle bin and the second-stage recycle bin, I started digging a little further and found the versioning settings in the site settings. Versioning for this particular library was left at the default, which keeps 500 versions of each file. I opened up a basic current Excel spreadsheet and found 177 versions of it, each around 400 KB in size (I think)... so I'd imagine this balloons multiplicatively with larger, more frequently accessed files. If I change this setting down to something more reasonable, say 10 versions instead of 500, does anyone know if SharePoint will automatically go through the entire library and dump the old versions?
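
For what it's worth, lowering the limit doesn't appear to retroactively purge existing versions on its own, so a trim pass is usually needed. A hedged PnP.PowerShell sketch (site URL, library name, and keep-count are assumptions; test against a copy first):

```powershell
# Sketch: delete all but the newest $keep past versions of each file.
Connect-PnPOnline -Url "https://contoso.sharepoint.com/sites/Clients" -Interactive

$keep = 10
Get-PnPListItem -List "Documents" -PageSize 500 | ForEach-Object {
    $item = $_
    if ($item.FileSystemObjectType -ne "File") { return }
    # Get-PnPFileVersion returns past versions only (not the current one)
    $versions = @(Get-PnPFileVersion -Url $item.FieldValues.FileRef)
    if ($versions.Count -gt $keep) {
        $versions | Sort-Object Created |
            Select-Object -First ($versions.Count - $keep) |
            ForEach-Object { Remove-PnPFileVersion -Url $item.FieldValues.FileRef -Identity $_.ID -Force }
    }
}
```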

@YeOldeStonecat
@Sky-Knight
@HCHTech

 
And each copy is deduped, so you aren't going to get back the savings you think you will.

As to your question, I honestly don't know. But if it's anything like the rest of M365, those settings won't hit a file until someone accesses it, or until the maintenance agent finally gets around to it. If it's the latter, you could wait up to a month.

@HCHTech Correct, which is why the business owner has access to the Archive team, and they're left picking what goes in there. IF they have the skill to handle it. Honestly, I haven't had anyone even hit the stock storage limits yet, and the ones that got close sorted themselves out when I told them they were about to have to buy more space.
 
I never played with those settings to "trim" retention... I always set my clients' tenants to "never delete".
The largest storage consumption I have is an engineering firm of around 20 peeps... tons of CAD drawings and spreadsheets, and they're at around 60% full.
 
Wild how no one is touching the 1TB +.15 limit and I'm here sitting at over double the default cap.

 
I've had clients that had that... and more... on their "on-prem server".
But before lifting that to 365, we had a few sessions of grooming their data ahead of time... so we only lifted important/current stuff and a little bit of "old stuff", but not all of it, especially the REALLY old stuff.
 
First thing is to sit down and figure out the value of "saving a few bucks". Isn't this the client you mentioned elsewhere that does many millions of dollars per year in business? If that's the case, is an extra $200 a month even worth debating over?


There are a few low-hanging-fruit options. Primarily cold storage. Do they need quick and easy access to stuff that's more than 12-18 months old? It sounds like over half the data they have shoved up there could just get shoved off to cold storage.

Each individual O365 license also gets 1TB of OneDrive storage, which isn't much worse than SharePoint storage. They could even spin up a few "dummy" users just to get access to that 1TB of space for stuff that's inconvenient to put in cold storage but that they don't want to fork over 20c per GB for in SharePoint space expansions.

Another thing is to "clean" the data - this gets very expensive because it can be insanely labor-intensive if their practices for storing data were poor. Organizing data, deleting dupes, killing off space eaten up by crazy numbers of file versions... that sort of stuff.
 
Digital hoarding eventually comes back to bite those that do it in the posterior.

I find it's usually a combination of the people doing it not knowing any better and/or being terrified of losing anything.

Folks either seem to throw caution entirely to the wind (i.e., their "server" is a 15-year-old Windows XP desktop from Wal-Mart with the entire C: drive shared with "Everyone" on the network), OR they are so entirely convinced that if they lose track of a single Word doc their entire business will be shut down the following day.

But yea, regardless of WHY... it's still a major PITA for whoever has to sort it out.
 
Two recent things that came into my orbit on this subject...

1.) While training for the CISSP: data classification. Data needs to be classified, and along with that a lifecycle established; failure to do this has legal consequences far more drastic than the technical ones. And yes... SMBs SUCK at this, they all seem to want to keep things forever. But from a legal perspective, you cannot have data discovered that you didn't keep!

The key note here is that these processes are NOT IT PROBLEMS, they are BUSINESS PROBLEMS. And if the business refuses to solve them, the IT can never be sorted, and IT can never be responsible for anything, ever. The lightest of negative consequences here is people paying for extra SharePoint space. Because classifying data and doing the correct thing is time-expensive and difficult, far too many simply choose to skip it.

2.) While working with one of my project Teams on this very issue:

SharePoint has TWO relevant features.
1.) Retention Policy
2.) Past Versions

And here's where things get really dumb... the two features ARE NOT COMPATIBLE WITH EACH OTHER, nor do they integrate in any way.

You can tell SharePoint, hey... I want to keep 100 versions of any given file, and it'll happily do that. But if you then tell SharePoint via retention policy, hey... I want you to keep these files for 3 years, then as soon as you do, you get infinite versions of every file until they're 3 years old, at which point they get deleted (or whatever) in accordance with the retention policy.

There is no way at present to configure SharePoint to say: hey, I want to keep 50 versions of every file, and you can nuke the parent file entry and all attached versions in 3 years, but until then no one can delete anything.

Think about it this way: if a retention policy says you cannot delete a file for 3 years, then once a file is impacted by that policy, it cannot be deleted! This INCLUDES all versions of the file too! In this example, that can leave thousands of versions of a single file stuck in storage for 3 years. Not good! That said, it's EXTREMELY HANDY for legal issues, because the entire document's history is maintained for future review. That is exceedingly important to larger orgs, but smaller ones... they don't want to pay for this. Never forget that all things Azure are designed for the Fortune 100 and up!
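
If you're not sure which retention policies are pinning a given site's files down in the first place, the Security & Compliance cmdlets can list them. A minimal sketch, assuming the ExchangeOnlineManagement module and an account with compliance-admin rights:

```powershell
# Sketch: list retention policies and the SharePoint sites they cover.
Connect-IPPSSession

Get-RetentionCompliancePolicy -DistributionDetail |
    Select-Object Name, Enabled, SharePointLocation
```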

So if a client is running out of space, retention policies MAKE IT WORSE! What you need to do instead is set the Past Versions limit, then run some PowerShell magic to reset the value on each already-stored file, AND THEN YET ANOTHER PowerShell pass to delete all the past versions beyond the configured value.
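
A sketch of the first half of that magic, using PnP.PowerShell to cap the version count on every document library in a site (50 is an assumed target; the trim pass sketched earlier in the thread would then delete the excess):

```powershell
# Sketch: set a major-version limit on all document libraries in a site.
Connect-PnPOnline -Url "https://contoso.sharepoint.com/sites/Projects" -Interactive

# BaseTemplate 101 = document library
Get-PnPList | Where-Object { $_.BaseTemplate -eq 101 } | ForEach-Object {
    Set-PnPList -Identity $_ -EnableVersioning $true -MajorVersions 50
}
```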

Don't ask me HOW all of this is done in detail, but I can tell you one of my Azure engineers has been fighting this process all week for a client. It's not pretty. To meet compliance objectives, this customer decided to have us clean up SharePoint, reset it to version counts only, remove all retention policies aside from the defaults at 90 days, and deploy Datto SaaS backup with infinite retention externally, because it's cheaper to get more storage from Datto than from Microsoft, and they have to keep everything for legal reasons for 7 years.

So yes, this crap IS THIS BIG. It's an enterprise problem, and congrats @thecomputerguy, it's now in your SMB space.
 