I have created some software that is capable of synchronising posts from Reddit to Lemmy. It’s still a little rough around the edges, but it works as follows:

People can request new subreddits to be mirrored on [email protected]. A bot (open source) monitors the threads there; when it finds a request for a new subreddit, it creates a new community on the Lemmit server and adds the subreddit to its monitored list. It then periodically checks whether any new posts have been made on Reddit and copies them over (it doesn’t copy any comments).
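
The flow described above could look roughly like the sketch below. This is not the bot's actual code: the `Mirror` class, the request pattern, and the in-memory "communities" are all hypothetical stand-ins, and `fetch_new_posts` is an assumed callback that would do the real fetching.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field
from typing import Callable, Iterable

# Hypothetical pattern for a request such as "please mirror r/python".
REQUEST_PATTERN = re.compile(r"\br/([A-Za-z0-9_]{2,21})\b")


@dataclass
class Mirror:
    """Toy in-memory model of the bot's state (all names are illustrative)."""

    monitored: set[str] = field(default_factory=set)
    copied_ids: set[str] = field(default_factory=set)
    communities: dict[str, list[str]] = field(default_factory=dict)

    def handle_request(self, text: str) -> list[str]:
        """Start mirroring each requested subreddit that is not monitored yet."""
        added = []
        for name in REQUEST_PATTERN.findall(text):
            key = name.lower()
            if key not in self.monitored:
                self.monitored.add(key)
                self.communities[key] = []  # stands in for creating the community
                added.append(key)
        return added

    def sync(self, fetch_new_posts: Callable[[str], Iterable[tuple[str, str]]]) -> int:
        """Copy posts (never comments) that have not been copied before."""
        copied = 0
        for name in sorted(self.monitored):
            for post_id, title in fetch_new_posts(name):
                if post_id in self.copied_ids:
                    continue  # already mirrored on an earlier check
                self.copied_ids.add(post_id)
                self.communities[name].append(title)
                copied += 1
        return copied
```

Calling `sync` on a timer gives the periodic check; remembering the copied post IDs is what keeps a repeated check from duplicating posts.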

Users can then subscribe to those communities from their own Lemmy instance, and from there federation will pick it up. Or at least, that’s the theory. At the moment, federation is not working well, and that is where my lack of fediverse knowledge comes in. Maybe it needs more time, or something is not set up properly - I don’t know.

Furthermore: registrations on this server are closed. The point of this service is not to become a community on its own, but to deliver, ehh, “original” content to all the rest of the Fediverse while it’s going through a ramp-up phase. Besides, the instance is running on a pretty small VPS, and I’d rather have this thing manage itself. There is a [email protected] community for questions about the project itself though, in case people want to discuss it further.

So ehm… Let me know what you think :)

  • sunaurus@lemm.ee
    1 year ago

    Interesting idea! I have some thoughts if you’re open to feedback:

    Furthermore: registrations on this server are closed. The point of this service is not to become a community on its own, but to deliver, ehh, “original” content to all the rest of the Fediverse while it’s going through a ramp-up phase.

    Have you considered moderation? These mirrored communities on lemmit.online will still be getting comments from all over the federated network, and if you’re the only user and sole moderator of every community, then it might get quite overwhelming!

    Besides, the instance is running on a pretty small vps, and I rather have this thing manage itself.

    Just in case you’re not aware, your instance will need to be able to handle:

    • Pushing out posts and comments to all other instances in the network
    • Accepting comments and votes from subscribers on any instance from the network

    A small VPS might not be able to handle that!

    It will then make periodic checks to see if any new posts (it doesn’t copy any comments) have been posted on reddit, and copy those over.

    How are you planning to deal with API limits from Reddit? Without paying, at most you’ll be able to make 6000 requests per hour, which means that you’ll only be able to get new posts from the last hour for up to 6000 subreddits. It might seem like a big number, but consider that there are (according to some old posts online) over 100,000 active subreddits.
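
    The arithmetic behind that limit can be made explicit with a small sketch. It assumes one request per subreddit per check, which is the figure implied above; the function name and parameters are made up for illustration.

```python
def min_poll_interval_hours(subreddits: int,
                            requests_per_hour: int = 6000,
                            requests_per_poll: int = 1) -> float:
    """Hours needed to check every subreddit once within the request budget."""
    if subreddits < 0 or requests_per_hour <= 0 or requests_per_poll <= 0:
        raise ValueError("counts must be non-negative and rates must be positive")
    return subreddits * requests_per_poll / requests_per_hour


# 6000 subreddits fit exactly into one hour of budget.
one_hour = min_poll_interval_hours(6000)

# 100,000 subreddits would take 100/6 hours (a bit under 17) per full pass.
full_pass = min_poll_interval_hours(100_000)
```

    In other words, at that budget each of 100,000 subreddits could only be checked about once every 17 hours.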

    • admin@lemmit.online (OP)
      1 year ago

      Interesting idea! I have some thoughts if you’re open to feedback:

      Always!

      Have you considered moderation? These mirrored communities on lemmit.online will still be getting comments from all over the federated network, and if you’re the only user and sole moderator of every community, then it might get quite overwhelming!

      I have, and I hope it won’t be a problem ;) As mentioned above, I’m a software engineer and have little interest in managing people outside of work :P If anyone wants to become a moderator, they’re free to request it.

      A small VPS might not be able to handle that

      We’ll see how well it does. I don’t mind spending a little money on this (a few dozen €/$ per month) if it takes off. In the end though, it’s meant more as a kickstart for Lemmy content than anything else.

      How are you planning to deal with API limits from Reddit?

      HA! By not using the API. For starters, because someone-who-isnt-me would like to browse NSFW content. I do a bit of client-side throttling between requests, which I hope will keep me under the radar. It’s mostly based on RSS for the subreddit overview, and scraping for the individual posts.
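
      A minimal sketch of what client-side throttling plus feed parsing might look like, not the bot's actual code. It assumes the subreddit feed is served as Atom XML; the `Throttle` class, `parse_feed` helper, and the interval value are all hypothetical, and the example parses a string instead of making a network request.

```python
import time
import xml.etree.ElementTree as ET

# Assumption: the feed uses the standard Atom namespace.
ATOM = {"atom": "http://www.w3.org/2005/Atom"}


class Throttle:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def parse_feed(xml_text):
    """Return (title, link) pairs for every entry in an Atom feed."""
    root = ET.fromstring(xml_text)
    posts = []
    for entry in root.findall("atom:entry", ATOM):
        title = entry.findtext("atom:title", default="", namespaces=ATOM)
        link = entry.find("atom:link", ATOM)
        posts.append((title, link.get("href") if link is not None else None))
    return posts
```

      Calling `wait()` before each request spaces the requests out; the clock and sleep functions are injectable only so the behaviour can be checked without real delays.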

      In the end… we’ll just have to see how it goes.

    • Zamboniman@lemmy.ca
      1 year ago

      The developer isn’t using the API. According to their response to my question above, they’re scraping instead. However, the moderation question is a really good point. The easy workaround is to set every new community to ‘only moderators can post’, so that the content is simply read-only.
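
      As a rough sketch of that workaround: when the bot creates a community, the request body could carry the moderators-only flag. The field name `posting_restricted_to_mods` is my understanding of Lemmy's community-creation request and should be checked against the API documentation for the server version in use; the helper function itself is hypothetical.

```python
import json


def community_payload(name: str, title: str, read_only: bool = True) -> dict:
    """Build a request body for creating a mirrored community.

    Assumption: the server accepts a `posting_restricted_to_mods` flag
    that limits posting to moderators. Verify before relying on it.
    """
    return {
        "name": name,
        "title": title,
        "posting_restricted_to_mods": read_only,
    }


# Serialised body for a hypothetical mirrored community.
body = json.dumps(community_payload("python", "Python (mirror)"))
```

      Authentication and the actual HTTP call are left out on purpose, since they depend on the instance and API version.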