I filed an issue on the lemmy and kbin issue trackers to address duplicate communities. If you have any #ActivityPub development experience or knowledge, please take a look and offer feedback. If not, please offer any feedback here.

  • fl3tching101@lemmy.world
    1 year ago

    I was actually just thinking about this earlier today. I definitely think this could turn into a problem. People are drawn to where other people are, so if a new user joins a smaller instance and goes looking for (to use your example) a gaming community, they will see that their smaller instance has, say, a few dozen posts, while the lemmy.ml version might have hundreds, thousands, or hundreds of thousands. So where do you think they will mostly look? And if they go to post something, why post it in the small local community where nobody will see it? This isn’t inherently a problem in a perfect world. If popular communities are spread evenly among instances, it works out, but that is unlikely to happen, because potential users will want to join popular instances for the same reason they gravitate to popular communities…

    I’m not at all an expert on the ActivityPub protocol or federation in general, so take this all with a grain of salt. As for the proposed solution, I don’t think it would work well, for one main reason: the duplication of content across instances. Assuming this solution is used, most instances would want their popular communities to be grouped so that their users have access to the popular content at the very least. But if grouping a community means that all posts, comments, media, etc. are shared with every member of the group, so that nothing is lost if an instance goes down, then that could have a huge data storage impact on instances. Say I want to set up an instance whose gaming community is grouped with the lemmy.ml gaming community, which has, say, 10,000 posts and 100,000 comments in total. Suddenly I need storage for all of that content and any associated media (pictures, video). This means any instance that wants the popular content will carry a huge storage burden before a single post is created from its own users.

    So what is my solution? I think that instead of syncing everything across the instances in a community group, it should be more of a link. Posts that go to one community in the group can be seen by users who subscribe to any of the communities in that group on any instance… if that makes sense. But they are still looking at the post on the instance it was posted to, not a synced copy. Instances would probably do some caching to prevent lots of queries for popular posts or whatever, but that’s getting too far into the details. The idea would just be to group the subscriptions of the same group rather than the posts of the same group. That does mean that if an instance goes down, the content posted from it will go down as well, but it alleviates the burden of hosting the entire community group’s content on every instance… so it’s a bit of a compromise.
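
    A minimal sketch of the "link, not copy" idea above, just to make it concrete. It is not how Lemmy or ActivityPub actually work: `PostRef`, `CommunityGroup`, `announce`, `feed_for`, and the instance `small.example` are all made-up names for illustration. The group stores only references to posts, and a post is visible only while the instance it was posted to is online.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PostRef:
    """Hypothetical pointer to a post; the content stays on the origin instance."""
    origin_instance: str
    community: str
    post_id: int
    title: str


@dataclass
class CommunityGroup:
    """Hypothetical group of same-topic communities on different instances."""
    name: str
    members: set = field(default_factory=set)  # addresses like "gaming@lemmy.ml"
    refs: list = field(default_factory=list)   # references only, no post bodies

    def announce(self, ref: PostRef) -> None:
        """Record a reference to a post made in one of the member communities."""
        address = f"{ref.community}@{ref.origin_instance}"
        if address not in self.members:
            raise ValueError(f"{address} is not part of group {self.name}")
        self.refs.append(ref)

    def feed_for(self, subscribed_address: str, online_instances: set) -> list:
        """Posts a subscriber of any member community can see right now.

        A reference is usable only while its origin instance is online,
        which is the compromise described in the comment above.
        """
        if subscribed_address not in self.members:
            return []
        return [r for r in self.refs if r.origin_instance in online_instances]


group = CommunityGroup("gaming", members={"gaming@lemmy.ml", "gaming@small.example"})
group.announce(PostRef("lemmy.ml", "gaming", 1, "Big instance post"))
group.announce(PostRef("small.example", "gaming", 2, "Small instance post"))

# A subscriber on the small instance sees posts from both communities...
both_up = group.feed_for("gaming@small.example", {"lemmy.ml", "small.example"})
# ...but loses the lemmy.ml posts if lemmy.ml goes down.
one_down = group.feed_for("gaming@small.example", {"small.example"})
```

    The small instance never stores the lemmy.ml posts here, only the references, so its storage grows with its own content rather than with the whole group.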

  • kakes@sh.itjust.works
    1 year ago

    I wonder if something like a hashtag system or built-in multi-communities would be a good solution. It’s definitely something I’d say needs to be addressed at some point, but I’m not sure what the right fix would be, especially as I’m not yet familiar with the specifics of ActivityPub. The solution you propose seems to be a good step in the right direction.

    I’ll also say, since I’ve seen some people suggesting it: I don’t think limiting communities to a single instance is the answer, because if that instance ever goes down for whatever reason, the whole community is gone. It should be distributed across instances by design, imo.

    • calculuschild@vlemmy.net
      1 year ago

      What about adding some ability for instances to co-host a community? One single community, but the two instances share the load like a distributed server system? Or even at its simplest, one just acts as a backup in case the other goes down?
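
      A rough sketch of what co-hosting could look like, assuming a made-up `CoHostedCommunity` class and placeholder hosts `a.example` and `b.example` (none of this is an existing Lemmy feature). Requests rotate across the healthy hosts to share the load, and if one host goes down the other takes everything, which covers the simple backup case too.

```python
from dataclasses import dataclass, field


@dataclass
class CoHostedCommunity:
    """Hypothetical community served by several instances at once."""
    name: str
    hosts: list                                # instances co-hosting the community
    down: set = field(default_factory=set)     # hosts currently unreachable
    _counter: int = field(default=0, init=False, repr=False)

    def healthy_hosts(self) -> list:
        """Hosts that are currently reachable, in their configured order."""
        return [h for h in self.hosts if h not in self.down]

    def pick_host(self) -> str:
        """Round-robin over healthy hosts; fall back to whichever is still up."""
        healthy = self.healthy_hosts()
        if not healthy:
            raise RuntimeError(f"no host available for {self.name}")
        host = healthy[self._counter % len(healthy)]
        self._counter += 1
        return host


community = CoHostedCommunity("gaming", hosts=["a.example", "b.example"])
first, second = community.pick_host(), community.pick_host()  # load is shared

community.down.add("a.example")                               # one host fails
after_failure = [community.pick_host() for _ in range(3)]     # backup takes over
```

      This only covers routing. Keeping the two hosts' copies of the community in sync is the hard part and is not addressed here.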