Andy Reid@lemmy.world to Technology@lemmy.world · English · 2 years ago
AI companies are violating a basic social contract of the web and ignoring robots.txt (www.theverge.com)
KillingTimeItself@lemmy.dbzer0.com · 2 years ago
Hmm, I thought websites just blocked crawler traffic directly? I know one site in particular has rules about it, and will even go so far as to ban you permanently if you continually ignore them.

Bogasse@lemmy.ml · 2 years ago
Detecting crawlers can be easier said than done 🙁

KillingTimeItself@lemmy.dbzer0.com · 2 years ago
I mean yeah, but at a certain point you just have to accept that it's going to be crawled. The obviously negligent ones are easy to block.
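
For what it's worth, the "easy to block" case could look something like the sketch below: a plain substring check on the User-Agent header. This is only an illustration, not how any particular site does it. The blocklist contents and the function name `is_obvious_crawler` are assumptions made up for the example, and a crawler that spoofs a browser User-Agent passes straight through, which is the "easier said than done" part.

```python
from typing import Optional

# Assumed, illustrative blocklist of User-Agent substrings.
# A real site would maintain its own list.
BLOCKED_UA_SUBSTRINGS = ("GPTBot", "CCBot", "python-requests")


def is_obvious_crawler(user_agent: Optional[str]) -> bool:
    """Return True if the User-Agent is missing or contains a blocklisted token."""
    if not user_agent:
        # Clients that send no User-Agent at all are treated as suspicious here.
        return True
    ua = user_agent.lower()
    return any(token.lower() in ua for token in BLOCKED_UA_SUBSTRINGS)
```

A server could call this on each request and return a 403 when it is true. It only catches clients that identify themselves honestly or not at all.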