Re: Horizontal scaling - advice needed



On 07.03.2007 21:04, Greg Lorriman wrote:
I am not sure whether I get your interaction correctly. What types of
things would have to be dealt with between users that are not done
through persistent state?

By interaction I just mean recording user relationships, like when one
user records another user as a friend on Slashdot. There'll be a
'friend' table with two foreign keys.
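
For illustration only, a 'friend' table with two foreign keys might look roughly like the sketch below. It uses Python's built-in sqlite3 module purely as a stand-in database; the table and column names (users, friends, user_id, friend_id) and the helper functions are assumptions, not anything taken from the actual application.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a users table and a friends table."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE users (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        -- Hypothetical schema: one row per "user marks friend" relationship.
        CREATE TABLE friends (
            user_id   INTEGER NOT NULL REFERENCES users(id),
            friend_id INTEGER NOT NULL REFERENCES users(id),
            PRIMARY KEY (user_id, friend_id)
        );
    """)
    return conn


def add_friend(conn, user_id, friend_id):
    """Record that user_id lists friend_id as a friend, in one transaction."""
    with conn:  # commits on success, rolls back if the insert fails
        conn.execute(
            "INSERT INTO friends (user_id, friend_id) VALUES (?, ?)",
            (user_id, friend_id),
        )
```

On a single database both foreign keys are checked in the same transaction, which is exactly what gets harder once the two user rows live on different machines.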

In my naive idea of things each machine would be equivalent to the
others. They would each have Webserver->Rails->Database. Interactions
between users means recording a new relationship between two users,
normally quite straightforward on a single database, but requiring
inter-machine communication and a certain amount of fiddling about
where the user accounts are on different machines/database-backends.
Two-phase commit comes to mind, but I intend to work around the lack
of that.
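
One possible shape for such a workaround (not necessarily the one intended here) is to write to each user's home store in turn and undo the first write if the second one fails. The sketch below models that with in-memory stand-ins; Shard, record_friendship and the home_of mapping are all hypothetical names made up for the example.

```python
class Shard:
    """Stand-in for one machine's database; holds the friend edges it stores."""

    def __init__(self, name):
        self.name = name
        self.edges = set()
        self.available = True  # flip to False to simulate an unreachable machine

    def add_edge(self, user, friend):
        if not self.available:
            raise ConnectionError(f"shard {self.name} unreachable")
        self.edges.add((user, friend))

    def remove_edge(self, user, friend):
        self.edges.discard((user, friend))


def record_friendship(home_of, user, friend):
    """Store the (user, friend) edge on both users' home shards.

    This is a compensation scheme, not two-phase commit: if the second
    write fails, the first write is undone and the error is re-raised.
    """
    first = home_of[user]
    second = home_of[friend]
    first.add_edge(user, friend)
    if second is first:
        return  # both accounts live on the same machine: one local write
    try:
        second.add_edge(user, friend)
    except ConnectionError:
        first.remove_edge(user, friend)  # compensate for the partial write
        raise
```

In a real deployment the compensating step can fail too (for example if the first machine dies in between), so some repair or retry mechanism would still be needed; that gap is what two-phase commit is meant to close.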

In other words I'm shifting the scalability problem to the network
(routing, switches etc). I may also address the networking problem, in
the distant future, by migrating accounts based on usage patterns so
that user "interactions" will tend to be local to one machine.
Obviously that is a strategy with some interesting problems to be
solved, as you can probably guess, and I am looking forward to
tackling them.

Don't do that, i.e. don't work with multiple data stores because there are solutions out there that deal with all the replication and consistency issues (if you need it) - even in open source land. It's not worth reinventing *that* wheel because it's difficult to get right *and* fast.

As others have suggested, use one datastore. Start out with a single DB server (or a tandem if you need failover) and scale later. Your options depend on the DB vendor you are going to use, so you should explore scalability options before you start (Oracle has them, MS SQL Server has them, and AFAIK MySQL has them too; don't use MaxDB).

Btw, Oracle can scale pretty well when using Shared Server - of course this depends on the workload. See
http://download-uk.oracle.com/docs/cd/B19306_01/server.102/b14220/process.htm#sthref1644

If I wanted to do something like you and were targeting a high-profile commercial application, I would go with Oracle. It has its drawbacks, but it has a proven track record for stability, and it's a lot more flexible and better tailored to large volumes than other products (which at the same time also means more complex). DB2 also seems a likely candidate for large installations, but I only have brief experience with it and I did not like some things I saw. That may have changed, though.

Thanks for your words of wisdom,
Just a quickie as I'm on my way out: your DRbing will certainly hurt
horizontal scalability - apart from the issue of finding instances etc.

Do you think that my answer above addresses that?

It's probably also ok to assume some session stickiness as load
balancing routers can do that (for example based on IP) and this seems a
fairly common scenario. If not, you need some mechanism to make session
information available to all app servers (either via the backend store
or via some other mechanism).
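
As a rough illustration of IP-based stickiness, the idea is simply that the same client address always maps to the same app server. The sketch below is an assumption about how such a mapping could be computed, not how any particular router does it; pick_backend and the backend names are made up for the example.

```python
import hashlib


def pick_backend(client_ip, backends):
    """Map a client IP to one backend, always the same one for the same IP."""
    # A stable hash is used because Python's built-in hash() of strings
    # changes between interpreter runs.
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


# Example: requests from one address keep landing on the same app server.
app_servers = ["app1", "app2", "app3"]
chosen = pick_backend("192.0.2.10", app_servers)
```

Note that with a plain modulo like this, adding or removing an app server remaps most clients, so session data would still need to live in a shared store if sessions must survive such a change.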

Definitely. I have had this in mind. Perhaps ultimately I would end up
with two (or more) domain (as in data) databases, one session
database, one proxy-redirection database, and the proxy-redirector
itself. And one day each would run on its own machine. Right now I am
imagining several VMware instances to allow for development.

I'd rather have the load balancing in the router. Routers are fast and built to do the job; creating a proprietary proxy for distribution based on some custom criteria is almost guaranteed to be slower than a router. (Even if the router is a Linux box; I don't know whether such solutions exist.)

Btw, if you are willing to pay for robustness you can also create a solution with VMware ESX where a number of VMs can be served by a pool of physical machines; if one machine fails, others can take over its VMs. Sounds cool but is not cheap. :-)

Kind regards

robert
