Wednesday, July 7, 2010

Repositories

I wonder if I may be taking "repositories" in the wrong direction in my code, so (in a similar light to Sean's last post) I'm looking for some feedback from the group.

In the anemic domain model that I'm using, crudely sketched in an earlier post, I have a handful of "domain cores" (the big balloon thing in the sketch) and a bunch of pluggable "repositories" that get injected into them.  But I'm using the term "repository" more broadly than, say, a table in a database.  I'm using it as more of a business area that involves some kind of IO.  So I have repositories such as:

  1. FileSystemRepository
  2. EMailRepository
  3. CoreDataRepository (the company's core critical data is part of a 3rd party application suite)
  4. CommonDataRepository (a database of common data that internal applications share)
  5. HRDataRepository (the database used by the company's 3rd party HR software)
  6. ActiveDirectoryRepository
  7. [InsertVendorHere]Repository
  8. etc.

So, since the "repository" can interact with different kinds of data in different ways, it doesn't just implement CRUD procedures.  Its procedures have a little more business meaning.  For example, the EMailRepository might have a method called SendNotificationsFor[InsertBusinessProcessHere](args) that will, based on the arguments it receives, send out some email notifications.  Or the ActiveDirectoryRepository might have GetUsersInGroup(args) that takes a group name or group ID of some kind and returns a list of user objects.

In this design, anything specific to the repository is kept in the repository.  The methods only accept/return standard types or domain objects.  So the CommonDataRepository would never return a DataSet or a list of an internal entity type, it internally converts anything from the data source into a domain type and returns that.  The idea there is to keep any knowledge of the actual I/O strictly inside the repository.
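To make that boundary concrete, here is a minimal sketch in Python (the real code is presumably .NET, so treat this purely as an illustration). The `Department` type, the `get_departments` method, and the `DEPT_CD`/`DEPT_NM` column names are all made up for the example; the list of dicts stands in for whatever a DataSet would hold.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Department:
    """Hypothetical domain type: the only shape callers ever see."""
    code: str
    name: str


class CommonDataRepository(Protocol):
    """Interface as the domain core might declare it (assumed signature)."""

    def get_departments(self) -> list[Department]: ...


class InMemoryCommonDataRepository:
    """Stand-in implementation; the raw rows play the role of a DataSet."""

    def __init__(self, rows: list[dict]) -> None:
        self._rows = rows  # storage-specific shape, never returned directly

    def get_departments(self) -> list[Department]:
        # Convert storage-specific rows into domain objects before returning.
        return [
            Department(code=row["DEPT_CD"].strip(), name=row["DEPT_NM"].strip())
            for row in self._rows
        ]


repo: CommonDataRepository = InMemoryCommonDataRepository(
    [{"DEPT_CD": "FIN ", "DEPT_NM": "Finance"}]
)
departments = repo.get_departments()
```

The caller only ever touches `Department`; the padded column values and the row layout stay inside the repository.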

So I guess the first question is, am I off-base here?  Is this wrong?  Maybe I just need to not use the terminology "repository" in describing these things?

Further down this road, I have another question about how these are organized.  Basically: Should repositories be able to internally reference each other?

For example, let's say we're getting the users for a group from the ActiveDirectoryRepository.  It will return a List<User> where "User" is a custom domain user object.  Let's also say that the User object has some fields on it that are populated by the HRDataRepository.  So any time you populate a User object, it's going to need to contact both of these repositories.

Currently I have it set up such that, internally, the ActiveDirectoryRepository has an injected dependency on the HRDataRepository and, any time it builds a User object (or list of User objects, etc.) it calls the HRDataRepository.  The reasoning for this is, again, to keep knowledge of this in the repositories.  Any developer who's writing code to hit that repository just expects back a list of Users, all good and proper.  We certainly shouldn't expect every business logic function in the domain to manually hit both repositories and combine the data, should we?
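Roughly, the arrangement looks like the sketch below, again in Python for illustration only. The `job_title` field, `get_job_title`, and the fake HR class are assumptions invented for the example, and the dictionary of groups stands in for a real directory lookup.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class User:
    username: str
    job_title: Optional[str] = None  # hypothetical field filled from HR data


class HRDataRepository(Protocol):
    """Interface visible to other repositories via the domain core assembly."""

    def get_job_title(self, username: str) -> Optional[str]: ...


class FakeHRDataRepository:
    def __init__(self, titles: dict[str, str]) -> None:
        self._titles = titles

    def get_job_title(self, username: str) -> Optional[str]:
        return self._titles.get(username)


class ActiveDirectoryRepository:
    def __init__(self, groups: dict[str, list[str]], hr: HRDataRepository) -> None:
        self._groups = groups  # stand-in for a real directory query
        self._hr = hr          # injected dependency on another repository

    def get_users_in_group(self, group_name: str) -> list[User]:
        # Build each User from directory data, then enrich it from HR.
        return [
            User(username=name, job_title=self._hr.get_job_title(name))
            for name in self._groups.get(group_name, [])
        ]


hr = FakeHRDataRepository({"asmith": "Analyst"})
ad = ActiveDirectoryRepository({"Finance": ["asmith", "bjones"]}, hr)
users = ad.get_users_in_group("Finance")
```

The calling code never learns that two data sources were involved, which is the appeal. The cost is that the ActiveDirectoryRepository can no longer be built or tested without some HRDataRepository behind it.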

Am I looking at long-term support problems here?  Does anybody have any thoughts or insight on this?  The IoC of the whole thing works out fine because the repository assemblies (each one is its own project) reference the domain core assemblies into which they'll be injected, so they can see the interfaces for each other inside those assemblies and the IoC container hooks them up just fine.

Thoughts?

5 comments:

  1. I had to read part of the entry for the Repository pattern to see what the definition was but it sounds like it's just to be the layer between your domain and your data mapping as well as a place to contain query code. So I want to say that a repository (by definition) should be just something that can retrieve and possibly manipulate your models. The source that it gets it from (file system, OS properties, database) doesn't matter I think.

    Some of your use cases for repositories sound like what I generally put in my domain services. I'm trying to use these domain services as our public API into the system so our windows services and web application are built on top of them.

    For me every system interaction has to go through a domain service and each one could depend on other services or repositories. Sometimes they are a pass through to a single repository and sometimes they wrap up multiple calls, context/action specific business logic, and/or package data together if it is coming from different sources. Nothing outside of the domain is supposed to know about the repositories and I'm undecided if I like repositories using each other.

    I prefer that repositories purely do data query/command style stuff. So just: I want these or save this. Then the domain models represent the data, the relationships between that data, and any business logic/rules around those. They don't know anything about the domain services or the repositories so if they need something extra to perform a check or action it has to be provided for them as a parameter to the method. Finally the domain services act like the API into the system and are where procedural logic happens, context specific logic, and making calls to external services. The external services are described as interfaces in the infrastructure namespace (as are the repositories) and implementations exist in separate projects. The implementations take care of the mapping between 3rd party types and our own internal models as needed.
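    A rough sketch of that layering, in Python for illustration only: each repository answers one plain query and the domain service does the combining. All of the names here (`UserService`, `DirectoryRepository`, `HRRepository`, `job_title`) are hypothetical and not taken from any real system.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class User:
    username: str
    job_title: Optional[str] = None


class DirectoryRepository(Protocol):
    def get_usernames_in_group(self, group_name: str) -> list[str]: ...


class HRRepository(Protocol):
    def get_job_title(self, username: str) -> Optional[str]: ...


class UserService:
    """Domain service: the only place that knows two sources are involved."""

    def __init__(self, directory: DirectoryRepository, hr: HRRepository) -> None:
        self._directory = directory
        self._hr = hr

    def get_users_in_group(self, group_name: str) -> list[User]:
        usernames = self._directory.get_usernames_in_group(group_name)
        return [User(name, self._hr.get_job_title(name)) for name in usernames]


class FakeDirectory:
    """Test double standing in for a directory-backed repository."""

    def get_usernames_in_group(self, group_name: str) -> list[str]:
        return ["asmith"] if group_name == "Finance" else []


class FakeHR:
    """Test double standing in for an HR-backed repository."""

    def get_job_title(self, username: str) -> Optional[str]:
        return {"asmith": "Analyst"}.get(username)


service = UserService(FakeDirectory(), FakeHR())
users = service.get_users_in_group("Finance")
```

    The repositories stay independent of each other here, at the price of one more layer that callers have to go through.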

  2. Obviously system nomenclature is a matter of preference, however in our system:

    Repositories are specified as handling acquisition and persistence of models. They are only the CRUD operations.

    Services are the components that utilize models to perform various generic operations. These are usually not process specific, so things like email, printing, integration calls, things like that.

    Processors embody the standard business processes we perform. All the specific business actions the system needs to perform are logically located within them.

    Domain Events centralize all the contextually based custom logic. This allows us to segregate the custom logic from the standard operations.

  3. I think, as you say, it's mainly a nomenclature thing then. Maybe I should rename my repositories to services just to avoid any future confusion.

    The "persistence of models" is really cloudy here since the application ecosystem is spread out among so many disparate parts of the business. Something as simple as a "User" object can easily link back to 4 or 5 different data stores (or more), not all of which are writable or even databases at all.

    I imagine there will be plenty of room for proper repositories inside these services as the ORMs begin to take shape, but that's a bridge crossed at another time.

  4. Something that is pushed with Domain Driven Design (not to say that any of us are following it) is the idea of a context. So terms might be reused between contexts but could mean different things. They take this from how normal domain vocabulary could differ between departments or specific fields.

    So perhaps you have 5 different specific User objects that may have to be worked with in the different contexts, or you have a single User idea in the system but 5 different contexts that you can work with them in.

    Also I would suggest trying to make your Repositories follow the pattern definition for the most part. I say that because having a consistent terminology that we as developers can speak about will only help us in sharing ideas. Technically my services are like Transaction Scripts according to Fowler.

    I like the idea of having these specific business processes in Processors. I haven't made that leap yet in my system design (although there are only really a couple of processes) and it all sits in the services (like I said, our public API of sorts). What I do need to do is separate internally and externally available services (possibly where processors and services come in). Although I think for now I'd better not introduce too much more; I still hear complaints about how slow the system is to develop in and how there are all these layers and abstractions.

    I do have to say that my services are fairly procedural. They do have dependencies that they take in so that provides some method of extension and overriding of behavior but I do wonder if these types of things would flow and be easier to work with in a functional language where functions are real first class citizens. My domain model makes a lot of sense in the object-oriented world though and I suppose repositories as well.

  5. Well I'm starting to run into situations where multiple repository actions have to happen to keep some relationships consistent. So I'm starting to lean towards just letting the Repositories use each other to keep everything in sync.

    Pushing this stuff down towards the Repositories means that the persisted state has to be this way and it makes the Domain Services more likely to be a pass through.

    Of course the Repositories can get each other injected the same as anywhere else and I'm starting to think that maybe for the pass through scenarios the Repositories could implement the Domain Services interfaces as well to reduce code maintenance.

    I'm not sure how I feel about that though. You guys have any opinions on this?
