morphine wrote:Then I think your specs should suffice.
Alright, well at least I'm on the right track.
morphine wrote:Regarding the key/value store, not trying to tell you how to do your job, but it sounds like you'd be better off if you used a database of some sort, potentially coupling it with Memcache. The reason being that RDBMSs tend to be much better at handling caching by keeping hotly accessed data in memory, you can take better advantage of indexes, etc.
I'll defend the decision here for a second: all modern operating systems already do exactly that, automatically. Anything recently accessed is cached in RAM as long as there's memory to spare. So we figured, why not let the OS take care of caching for us rather than setting up and maintaining a dedicated DB? All we need is a simple key/value store, and if that turns out not to be good enough we can add a Redis layer on top. Besides, on most modern filesystems finding a file is a B-tree (or similar indexed) lookup, so it stays fast even for large numbers of entries. A filesystem is just a database at heart, and we're looking to leverage that to our advantage since we don't need any heavy joining/filtering/etc.
But I do see what you mean. We're opting for simplicity at the cost of a slight performance penalty, and we're hoping that penalty is negligible.
JdL wrote:The reason it works is because the app is NOT using most of the traditional RDBMS features and behaves as a simple key-value lookup.
That's what we're hoping for. Maybe I'll look into pre-emptive caching like you describe, but our dataset can be several gigabytes with several million entries.
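If pre-emptive/in-process caching ever becomes worth it, a bounded LRU layer in front of the disk reads is the usual middle ground before reaching for Redis. A sketch (the class and cap are illustrative, not anything from the thread) that exploits the insertion-order iteration of a JavaScript `Map` to track recency, with an entry cap so a multi-gigabyte dataset can't blow up the Node heap:

```javascript
// Small in-process LRU cache: a Map iterates in insertion order, so
// re-inserting on every hit keeps the least recently used key first.
class LruCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.map = new Map();
  }
  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    this.map.delete(key); // re-insert to mark as most recently used
    this.map.set(key, value);
    return value;
  }
  set(key, value) {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxEntries) {
      // First key in iteration order is the least recently used one.
      this.map.delete(this.map.keys().next().value);
    }
  }
}

const cache = new LruCache(2);
cache.set('a', 1);
cache.set('b', 2);
cache.get('a');    // touch 'a', so 'b' is now least recently used
cache.set('c', 3); // evicts 'b'
console.log(cache.get('b')); // undefined
```

On a cache miss you'd fall through to the filesystem read; since the OS page cache is already doing most of the work, this layer mainly saves the syscall and hashing overhead for the very hottest keys.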
JdL wrote:As developers / administrators, having less configuration / apps running to deal with / maintain is almost as important as the functionality itself. KISS principle.
morphine wrote:JdL, that's correct, but the OP specifically said he's reading from disk, not from memory, with disk I/O being heavy. If he were just storing stuff in memory, then I'd never recommend another layer of complexity.
I'm replacing an old system that stores everything in memory and is constantly running out of it. We're using the disk because scaling is much, much easier and, as you both noted, it's simple.
JdL wrote:Spec-wise, it is ideal to have at least 1 CPU thread available per-nodejs-process, plus 1 or 2 free for OS / network resources. So make sure your CPU(s) can juggle enough threads.
Noted.
JdL wrote:Also - RAM is cheap right now. 16GB is low if your db (whatever it is) grows. SSDs will help mitigate RAM sizes, but then you're eating into your disk space.
Haha, I don't think any of my teammates are going to let me get away with less than 32GB, from the sound of it. The K/V DB doesn't cache anything itself; I'm cheating by relying on the filesystem cache, so the additional RAM will help us out.