Thursday, August 6, 2009

Getting Stats on documents added to SharePoint

I was tasked recently to figure out how many files were being stored in our SharePoint farm. Seems simple enough. Unfortunately there isn't a simple way built into SharePoint to give you simple information. Double unfortunately we didn't own the third party tools that would have made it easy. So I went right to where all the files were kept: the Content Databases.

Glancing through the tables a few of their names jumped right out at me: AllDocs and AllDocStreams. "This is going to be easy" I thought, just get a count(*) of these tables and I'm done. Just to be on the safe side I took a look at the data that was actually stored in these tables. It appeared that not only user loaded documents, but any page that sharepoint delivers is contained in these tables (AllForms.aspx, EditForms.aspx...).

Rather than try to hack away at the database myself, I found this blog which explained what data the tables contain and he had some clever SQL queries for delivering monthly growth by file quanty: http://suguk.org/blogs/sharepointhack/archive/2008/03/17/9161.aspx. I grabbed a piece of one of his queries to just a count of how many files are stored in one database:
USE (databasename)
SELECT COUNT(*) AS added
FROM
AllDocs INNER JOIN
AllUserData ON AllDocs.LeafName = AllUserData.tp_LeafName AND
AllDocs.DirName = AllUserData.tp_DirName AND
AllDocs.ListId = AllUserData.tp_ListId
WHERE (AllDocs.SetupPath IS NULL) AND alldocs.id IN (SELECT id FROM alldocstreams)
The problem now was that our installation had four content databases per application and 20 applications total. Rather than copy this query 80 times and changing the USE statement for each of them, I built out a stored procedure. The stored procedure cycles through all of the SharePoint content databases, inserts the number of files into a Table variable and finally spits out the final count with a summation query.
One note is that the entire procedure hinges on the fact that all of the content databases in our farm have the word "Content" in their name. Also they are the only databases that contain the word "Content" in the name on that server.

Here it is:
SET NOCOUNT OFF

DECLARE @strSQL NVARCHAR(1000)
DECLARE @strDatabasename varchar(200)

DECLARE @numFiles int

DECLARE @CountingTable TABLE(numberFiles int)

DECLARE MyCursor CURSOR FOR
select sys.databases.name as DB from sys.databases where sys.databases.name like '%Content%' AND state_desc='ONLINE'

open MyCursor
FETCH Next FROM MyCursor INTO @strDatabaseName
WHILE @@Fetch_Status = 0
BEGIN
SET @strSQL = 'SELECT @RecordCount = COUNT(*) FROM ['+@strDatabaseName+'].dbo.AllDocs INNER JOIN ['+@strDatabaseName+'].dbo.AllUserData ON ['+@strDatabaseName+'].dbo.AllDocs.LeafName = ['+@strDatabaseName+'].dbo.AllUserData.tp_LeafName AND ['+@strDatabaseName+'].dbo.AllDocs.DirName = ['+@strDatabaseName+'].dbo.AllUserData.tp_DirName AND ['+@strDatabaseName+'].dbo.AllDocs.ListId = ['+@strDatabaseName+'].dbo.AllUserData.tp_ListId WHERE (['+@strDatabaseName+'].dbo.AllDocs.SetupPath IS NULL) AND ['+@strDatabaseName+'].dbo.alldocs.id IN (SELECT id FROM ['+@strDatabaseName+'].dbo.alldocstreams)'

execute sp_executesql @strSQL, N'@RecordCount int OUTPUT', @numFiles OUTPUT
INSERT INTO @CountingTable (numberFiles) SELECT @numFiles

FETCH Next FROM MyCursor INTO @strDatabaseName
END
SELECT Sum(numberFiles) FROM @CountingTable

close MyCursor
DEALLOCATE MyCursor
Enjoy!

No comments:

Post a Comment