I have occasionally dabbled in database design over the past few years. I once toyed with the idea of a church management database, which never really went anywhere. Mind you, the experts that examined it really liked the overall design, but I had some work to do with making it actually useful. Fast forward to the present, and I’ve tinkered a little more, and finally developed my own income and expense database that, so far, I really really like. Again, I still have some work to do yet, especially showing expenses for one, two or even three levels of categories. Still, the basic functionality is in place and seems to work very well. Here’s a little of what I did.
I could have simply used GnuCash or KMyMoney or other personal finance software, but then I would not have learned what I have about databases in general. Besides, I can’t stand to have to “sync” every single transaction – what I really want is to just see what I’ve earned and spent, and where all that money went. I chose PostgreSQL because OpenOffice.org & LibreOffice are currently experiencing problems with their development of Base. I also wanted to learn more about PostgreSQL, and Base’s development issues present a great opportunity to delve into the former. And once I connect LibreOffice to PostgreSQL, I should be able to create all the pretty charts that the personal finance software can create.
If you don’t know what PostgreSQL is, it is what is commonly called a “Relational Database Management System” or RDBMS. It is comparable to Oracle and MS SQL Server, and used in many quarters by some rather notable “customers”. It’s features make it one of the absolute best database systems available. It is, of course, libre (or what some might refer to as “open source”) software.
I bought “Beginning Databases with PostgreSQL” (APRESS) through my cell phone’s Kindle app. Incidentally, Kindle on a cell phone is great for novels, not so great for technical books with charts and tables that you might want to read. Worse, Amazon still has no version of Kindle for GNU/Linux distributions, and I don’t see a way to read the book via my web browser, either. Otherwise, the book is a great buy, and has helped me understand some things that basic documentation does not answer – at least, not very well.
I created 4 simple (more or less) tables: Entity, Category, Transaction, and TransactionDetails. That last one is thanks to the examples I’ve seen using “orders” and “order details” in tutorials. This last one is especially important, since it allows me to draw out very specific details, if I really want to. The Entity table, of course, stores the basic data for myself and businesses or other persons with whom I may do business. The Category table should be self-explanatory, but it currently holds up to 3 levels of categories. The Transaction table only holds the transaction date, and possibly a note about the general transaction. I make no attempt whatsoever to categorize the general transaction because a single transaction may frequently be split into multiple categories.
A note on the categories: Some people will not want the most minute details on every single transaction. That’s fine. If you don’t want to record every item you bought at the grocery store, you can just use the “Groceries: General” category, and be done with it. But the design lets one drill as deep as one wants to get. There may be times when you only want to see the top-level categories, but there might also be reason to drill deeper. This format lets the user decide at what level to stop drilling.
The TransactionDetails table is, frankly, where the majority of the “action” is. Well, the reality is that the “action” is in the queries, not the tables. Still, I can split a paycheck to show what Uncle Sam has taken out. I can show every item I purchased at the grocery store, and later run a report to see how much money I spent on fruits and vegetables, or on dairy products. I should pause here to say that my database does not show my current bank balance. I did not make any effort to do so. That said, one can create a category called “Accounts”, with subcategories for “Checking”, “Savings”, etc. Then, when I make a transfer, I simply perform a bit of double-entry accounting, showing the minus transaction for one account and a plus transaction for the other. Performing balance checks should not be overly complex either.
The following query shows how I have pulled out the Date, Entity and Amount, ordered by the date:
SELECT
transaction_date AS "Date", entity_name AS "Store",
sum(transdetails_amount) AS "Amount"
FROM
public.tblentity e, public.tbltransaction t, public.tbltransdetails d
WHERE
e.entity_id = t.entity_id AND t.transaction_id = d.transaction_id
GROUP BY entity_name, transaction_date
ORDER BY transaction_date;
The above SQL commands select the date, entity and the total amount for each transaction from the three main tables, groups them by the entity name and transaction date, and then makes sure the result is ordered by date (in this case, from oldest to most recent). Incidentally, the result shows my net pay from my most recent paycheck, as well as the totals for all my purchases at each restaurant or store. Oh, and it does all this in about 0.17 milliseconds. You’ll forgive me, I hope, for not showing the actual results, but maybe at some point I can produce some sample data that I won’t mind sharing.
This SQL shows the amounts by catgory:
SELECT category_name AS "Category", sum(transdetails_amount) AS "Amount"
FROM tblcategory c, tbltransdetails d
WHERE c.category_id = d.category_id
GROUP BY category_name;
The result shows what I spent in each specific category. I have not yet attempted to only categorize by the first-level categories. That should prove to be an interesting exercise, and I’ll try to return once I’ve done that. This query shows “income” as a gross amount, but also shows the taxes in their respective categories. Incidentally, I understand the design of my categories table (which I picked up here) will only go so far, but for most of my purposes it should work just fine. The main thing here, though, is that I can see where my money is going.
Another thing I should note is that, while writing SQL queries at the command prompt is not necessarily the worst thing in the world, using PGAdmin III certainly improves the work environment. I did not use the graphical query builder, except to display the tables (with their column descriptions), so I could see which columns I wanted to pull from. I can click between th SQL tab and the graphical tab to make sure I’ve crossed my ‘”t’s” and dotted my “i’s”. Also, you can save the queries once you’re sure they work correctly, and then you can save the resulting data into CSV files which you can then open in any spreadsheet.
Understand that this is more of a show-n-tell than a howto. I just wanted to share a little of my adventure in exploring the PostgreSQL database system and how one can do some really cool stuff, whether for personal or business matters. If anyone is interested, maybe I can devote a future post to the table design. Also, if you had to think of a name for this little database, what would you call it?