On the matter of duplicate posts...

This topic contains 22 replies, has 9 voices, and was last updated by SparklePants 2 days, 10 hours ago.

Viewing 15 posts - 1 through 15 (of 23 total)

Tomsk

said

For the attention of each of us!

I dare say I’m not the only one who stumbles upon a duplicate post from time to time. I don’t know how others deal with this, if at all, but I always report them to either Hunter or Monk for resolution, regardless of who made the posts (myself included).

As the site grows it’s inevitable that this will happen, but a simple search or two can tell us whether an item is already here. I can understand it when an item originally appeared in a bundle and may not be individually searchable; one can easily be led to believe that it hasn’t been posted.

Even then, when we’re filling in the submission boxes and asked for links, is it not easy enough to double check to see if it was part of a larger bundle at that point?

All I could suggest is that we all take the time to do a proper search, thus avoiding any bad blood further down the line when you suddenly realise that no-one is downloading what you shared because it’s no longer here! No-one likes to waste their time.

Please use the original title of the goods in question, copied and pasted directly from the source where possible. This will make items more easily searchable and help avoid duplicate posts caused by abbreviations or spelling mistakes.

Regards, Tomsk.

@hunter @Monk

January 11, 2017 at 6:12 pm

Hunter

said

Yes, please use the original post name.
It will help to find duplicate content.
And when you see duplicate content, just report it:
click Report, select Spam, and put the original post link in the text box.
Monk or I will delete the duplicate content.

January 11, 2017 at 6:22 pm

eelgoo

said

On the topic of searching.
Keep your search criteria simple.
That way you are more likely to get relevant hits.
For instance, if the character name is ‘Susan’ just search for that rather than exhaustive specific permutations of what the title might be.

🙂

January 11, 2017 at 6:23 pm

AndyMc4

said

I follow a simple process when I’m browsing the blog that I respectfully suggest everyone should consider. I always skip ahead a number of pages into the blog, and keep going until I see a page that I recognize from a previous visit. I don’t open posts yet, and certainly don’t download anything. Then, I start reading backwards towards the first page and open the first (and only the first) post of any interesting item that I encounter. Of course, when I say first, what I really mean is earliest, because I’m reading backwards.

Even if we assume that the vast majority of duplicates are caused by carelessness, not by a deliberate attempt to usurp someone else’s post, this approach still provides an incentive to avoid duplicates. It takes time to put a post together. If people check for duplicates a bit more carefully before they create a post, they can avoid wasting their time creating a post that no-one will read. This won’t stop duplication, because there will always be honest mistakes, but it might help to reduce it.

I don’t want to get into a debate about whether we should have points on Zone; let’s just take it as read that we do. I would like to see duplication reduced, and anything that might help is worth thinking about.

January 12, 2017 at 1:23 am

3Daz3D

said

@hunter, et al.

Hi all, duplicates are definitely a real problem with any organization’s data, so I offer you a simple IT solution that I created at the company I work for. It just requires SQL and a scripting language that works on your installation.

When I cleaned data for my client’s WordPress site, we had a MySQL database with membership and product tables. A member could have entered data for a certain product more than once but we needed to find unique instances of the first person that entered a product into a catalog to give bonus credits to the first one that entered the product. I think we can use the same pseudocode-logic to help solve the duplicates problem here at ZGFX.

Step 1. Use SQL to make a list of the Post ID, the date it was entered into your system, and the source hyperlink for every post whose original source HTTP link appears more than once. I know I said ‘where’, but the query will use ‘Group By’ and ‘Having’ clauses instead. Order By the entry date ASCending. Dump the list to a delimited text file.

Step 2. Remove from the list the first instance of each post’s original link on Daz/Rendo/wherever. The first instance of the hyperlink is the first post that made it into the system. You can automate this with your scripting language of choice; I use Perl since it is the bomb with text handling. Once you remove the first instances, you are left with the duplicate records that were posted after the original poster’s.

Step 3. Collect the remaining POST IDs. Again with your scripting language of choice collect just the POST IDs.

Step 4. Build a text file of the collected IDs and arrange them in a comma separated manner and encapsulated by ‘()’. You see where I am going next right? 😉

Step 5. Build the DELETE SQL to remove the collected IDs, for example:
DELETE FROM ZONEGFX_POST_TABLE WHERE POST_ID IN (1,2,3,4,5,77,778,789…) <- data provided by step 4.
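
Steps 1 to 5 above could be sketched roughly as follows. This is only a minimal Python illustration, not the site's actual code: it uses an in-memory SQLite database in place of MySQL, keeps the lists in memory instead of writing text files, and the column names `entry_date` and `source_link`, the sample rows, and all function names are assumptions made up for the example. Only the table name and `POST_ID` come from the post above.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory table with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE zonegfx_post_table ("
        "post_id INTEGER PRIMARY KEY, entry_date TEXT NOT NULL, "
        "source_link TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO zonegfx_post_table VALUES (?, ?, ?)",
        [
            (1, "2017-01-10", "https://example.com/product-a"),
            (2, "2017-01-10", "https://example.com/product-b"),
            (3, "2017-01-11", "https://example.com/product-a"),
            (4, "2017-01-12", "https://example.com/product-a"),
            (5, "2017-01-12", "https://example.com/product-c"),
        ],
    )
    conn.commit()
    return conn


def find_duplicate_ids(conn):
    """Steps 1-3: IDs of posts whose source link was already posted earlier."""
    # Step 1: every post whose source link occurs more than once, oldest first.
    rows = conn.execute(
        """
        SELECT post_id, entry_date, source_link
        FROM zonegfx_post_table
        WHERE source_link IN (
            SELECT source_link
            FROM zonegfx_post_table
            GROUP BY source_link
            HAVING COUNT(*) > 1
        )
        ORDER BY entry_date ASC, post_id ASC
        """
    ).fetchall()

    seen_links = set()
    duplicate_ids = []
    for post_id, _entry_date, link in rows:
        if link in seen_links:
            duplicate_ids.append(post_id)  # Step 3: collect the later posts
        else:
            seen_links.add(link)  # Step 2: the first instance is kept
    return duplicate_ids


def delete_posts(conn, post_ids):
    """Steps 4-5: build the IN (...) list and run the DELETE."""
    if not post_ids:
        return 0
    # Placeholders keep the IDs out of the SQL string itself.
    placeholders = ",".join("?" for _ in post_ids)
    cur = conn.execute(
        f"DELETE FROM zonegfx_post_table WHERE post_id IN ({placeholders})",
        post_ids,
    )
    conn.commit()
    return cur.rowcount
```

Calling `find_duplicate_ids(make_demo_db())` on the sample rows returns the two later posts of the repeated link, and passing that list to `delete_posts` removes them. On a real site it would be prudent to review the list, or take a backup, before running the delete.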

Yes, this solution is long-winded, and you could do all these steps at once in one gigantic, spaghetti-coded SQL call, but for maintainability the steps are broken down into 5 easy chunks. Also, these five steps can be executed through a cronjob on your web server so it happens once or twice a week on a schedule.
I foresee you will need to run this solution once over your entire database to dedupe it, but after that you can restrict your SQL to a smaller chunk of time, since the rest of the database is already ‘clean’ going forward. I would suggest running this automatically during the maintenance window of your site.
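
One way to read the "smaller chunk of time" idea is sketched below: only posts entered on or after a cutoff date are candidates for removal, but each is still compared against every older post. This is a hedged Python and SQLite illustration under assumptions; the columns `entry_date` (stored as ISO `YYYY-MM-DD` text) and `source_link`, the sample rows, and the function names are invented for the example, and scheduling it from cron is left out.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory table with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE zonegfx_post_table ("
        "post_id INTEGER PRIMARY KEY, entry_date TEXT NOT NULL, "
        "source_link TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO zonegfx_post_table VALUES (?, ?, ?)",
        [
            (1, "2017-01-10", "https://example.com/product-a"),
            (2, "2017-01-10", "https://example.com/product-b"),
            (3, "2017-01-11", "https://example.com/product-a"),
            (4, "2017-01-12", "https://example.com/product-a"),
            (5, "2017-01-12", "https://example.com/product-c"),
        ],
    )
    conn.commit()
    return conn


def find_recent_duplicate_ids(conn, since):
    """IDs of posts entered on or after `since` that repeat an older post's link.

    `since` is an ISO date string; ISO dates sort correctly as plain text.
    """
    rows = conn.execute(
        """
        SELECT p.post_id
        FROM zonegfx_post_table AS p
        WHERE p.entry_date >= ?
          AND EXISTS (
              SELECT 1
              FROM zonegfx_post_table AS older
              WHERE older.source_link = p.source_link
                AND (older.entry_date < p.entry_date
                     OR (older.entry_date = p.entry_date
                         AND older.post_id < p.post_id))
          )
        ORDER BY p.post_id
        """,
        (since,),
    ).fetchall()
    return [post_id for (post_id,) in rows]
```

With the sample rows, a cutoff of `"2017-01-12"` flags only post 4, while the older duplicate (post 3) is assumed to have been handled by an earlier run.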

If you want to implement something like this and need assistance do not hesitate to PM me. It is a simple process to wrap code around…

January 12, 2017 at 10:07 pm

charite

said

I noticed a very obvious dupe on the blog today, and the approval page also has several of them on it. It seems almost as if the uploaders don’t even care, and even worse, the dupes seem to get approved.

Right now there are 2 Genevieve 7 Pro Bundle posts approved almost next to each other on the blog. I have no clue how the approval process works, and I am very grateful for the work that goes into this page. But that one should have been easy to spot.

I have a system to eliminate dupes from my own collection, but it has a delay of 1-2 days. On other sites it rarely mattered, as the items don’t cost anything. But here they do, so it is annoying to spend points on duplicates.

January 12, 2017 at 10:14 pm

Tomsk

said

It looks like the Genevieve Pro bundle is an accidental duplicate post by gicor, so no harm done there. As for the others, I couldn’t say.

January 12, 2017 at 10:59 pm

3Daz3D

said

Addendum,
Step 6 would be ‘Run the DELETE SQL’ LOL. After thinking about it more: in step 1, ORDER BY entry date AND post ID. Or, if the post ID is assigned sequentially as each post arrives in the system, ordering by the post ID alone would suffice. So a post ID of 12 means it was the 12th record in, and post ID 3678 with the same source HTTP link is clearly a dupe that followed post ID 12.
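
If post IDs really are assigned sequentially, the "lowest ID wins" rule from this addendum can be expressed as a single query, sketched below in Python with SQLite standing in for MySQL. The `source_link` column, the sample rows, and the function names are assumptions for illustration; the sketch only lists the later duplicates and leaves deleting them as a separate, deliberate step.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory table with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE zonegfx_post_table ("
        "post_id INTEGER PRIMARY KEY, source_link TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO zonegfx_post_table VALUES (?, ?)",
        [
            (12, "https://example.com/product-a"),
            (40, "https://example.com/product-b"),
            (3678, "https://example.com/product-a"),
        ],
    )
    conn.commit()
    return conn


def later_duplicate_ids(conn):
    """IDs of every post that is not the lowest-numbered post for its link."""
    rows = conn.execute(
        """
        SELECT post_id
        FROM zonegfx_post_table
        WHERE post_id NOT IN (
            -- The smallest ID per link is treated as the original post.
            SELECT MIN(post_id)
            FROM zonegfx_post_table
            GROUP BY source_link
        )
        ORDER BY post_id
        """
    ).fetchall()
    return [post_id for (post_id,) in rows]
```

In the sample data, post 3678 shares its link with post 12, so only 3678 is reported. The demo table declares `source_link` as NOT NULL; if posts without a link were allowed, they would need to be excluded so that they are not grouped together as duplicates of each other.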

As a band-aid for duplicates going forward:

Construct some PHP that takes the original source link of the product and executes SQL to query the database. If the source link already exists there, the Add Post page’s PHP code stops the post from being added to the ZGFX Blog database and gives the poster a nice ‘Sorry, product already exists’ message.
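
The check described above could look something like the sketch below. It is written in Python with SQLite purely for illustration, whereas the post proposes PHP against the site's own database; the `title` and `source_link` columns and the `add_post` function are hypothetical names, not part of any real ZGFX code.

```python
import sqlite3


def make_empty_db():
    """Build a throwaway in-memory posts table (column names are assumptions)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE zonegfx_post_table ("
        "post_id INTEGER PRIMARY KEY, title TEXT NOT NULL, "
        "source_link TEXT NOT NULL)"
    )
    return conn


def add_post(conn, title, source_link):
    """Insert a post unless its source link is already in the table.

    Returns (added, message) so the calling page can show the message.
    """
    existing = conn.execute(
        "SELECT post_id FROM zonegfx_post_table WHERE source_link = ? LIMIT 1",
        (source_link,),
    ).fetchone()
    if existing is not None:
        # The link is already there: refuse the new post.
        return False, "Sorry, product already exists"

    conn.execute(
        "INSERT INTO zonegfx_post_table (title, source_link) VALUES (?, ?)",
        (title, source_link),
    )
    conn.commit()
    return True, "Post added"
```

The comparison here is an exact string match, so the same product reached through a differently formatted link would still slip through. Two people submitting at the same instant could also both pass the check; a unique index on the link column would be one way to close that gap.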

January 12, 2017 at 11:01 pm

Hunter

said

Sounds complex 🙁 Anyway, I will try to make it.
But for now, please report duplicate posts.
Do not post duplicate reports in the forum.
In each post you will see a Report button; just report the dupe post from there for now, please.

January 13, 2017 at 6:49 am

charite

said

Hunter, I have reported 4-5 duplicates in the last 2 days and they are still there; one of them even has a missing image.

January 13, 2017 at 9:02 am

charite

said

Thanks, I really have to remember that @ trick here.

@hunter, I have reported 4-5 duplicates in the last 2 days and they are still there; one of them even has a missing image.

January 13, 2017 at 10:10 am

charite

said

I could not help noticing that somebody seems to be splitting a product into DS and PS versions and wants them approved separately. While it is not a duplicate as such, it is a dirty trick, and I think it would be a shame if people started to upload products as individual components.

January 13, 2017 at 12:27 pm

Colle O’Grbage

said

you should have let that happen so we could see who’s the naive wannabe trickster

but perhaps monk will deal with it on his own anyway

January 13, 2017 at 12:49 pm

Copyright © 2013 ZoneGfx. All rights reserved.