I still consider myself new to Python in general, so please bear with me on this! I'm attempting to use Scrapy to gather data from websites. Once I've collected the data, I'd like it exported to a CSV file. So far, my attempts with the following code have resulted in files that aren't set up as tables at all.
My export code:
scrapy crawl products -o myinfo.csv -t csv
I concluded that I need to write some sort of pipeline to define what the column headers are. To the best of my ability, that meant writing the following code in the following two files.
pipelines.py
    class allenheathpipeline(object):
        def process_item(self, item, spider):
            return item

    from scrapy.conf import settings
    from scrapy.contrib.exporter import CsvItemExporter

    class allenheathcsvitemexporter(CsvItemExporter):

        def __init__(self, *args, **kwargs):
            delimiter = settings.get('CSV_DELIMITER', ',')
            kwargs['delimiter'] = delimiter

            fields_to_export = settings.get('FIELDS_TO_EXPORT', [])
            if fields_to_export:
                kwargs['fields_to_export'] = fields_to_export

            super(allenheathcsvitemexporter, self).__init__(*args, **kwargs)
settings.py
    BOT_NAME = 'allenheath'

    SPIDER_MODULES = ['allenheath.spiders']
    NEWSPIDER_MODULE = 'allenheath.spiders'

    ITEM_PIPELINES = {
        'allenheath.pipelines.allenheathpipeline': 300,
        'allenheath.pipelines.allenheathcsvitemexporter': 800,
    }

    FEED_EXPORTERS = {
        'csv': 'allenheath.allen_heath_csv_item_exporter.allenheathcsvitemexporter',
    }

    FIELDS_TO_EXPORT = [
        'model',
        'shortdesc',
        'desc',
        'series'
    ]

    CSV_DELIMITER = "\t"  # tab
Unfortunately, once I run the export command again:
scrapy crawl products -o myinfo.csv -t csv
I get this error:
      File "c:\allenheath\allenheath\pipelines.py", line 27, in __init__
        super(allenheathcsvitemexporter, self).__init__(*args, **kwargs)
    TypeError: __init__() takes at least 2 arguments (1 given)
Any help or guidance would be appreciated, as I've hit a brick wall here. Thank you.
You don't need to define a pipeline for exporting to CSV.
Scrapy will handle it automagically, and the information for the headers will be taken from the item definition.
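As for the TypeError in the question, it most likely comes from listing the exporter class under the item pipelines setting. As far as I know, Scrapy builds pipeline objects without passing any constructor arguments, while CsvItemExporter's constructor requires a file. Here is a stdlib-only sketch of that failure mode. The classes FileExporter and TabExporter and the helper build_without_arguments are hypothetical stand-ins made up for illustration, not Scrapy's real code:

```python
import io


class FileExporter(object):
    """Hypothetical stand-in for CsvItemExporter: the constructor needs a file."""

    def __init__(self, file, **kwargs):
        self.file = file
        self.options = kwargs


class TabExporter(FileExporter):
    """Mirrors the subclass in the question: it only forwards *args/**kwargs."""

    def __init__(self, *args, **kwargs):
        kwargs['delimiter'] = '\t'
        super(TabExporter, self).__init__(*args, **kwargs)


def build_without_arguments(cls):
    """Call cls() with no arguments and return the instance or the TypeError."""
    try:
        return cls()
    except TypeError as exc:
        return exc


# No file is passed, so the base constructor raises TypeError
# (assumption: this is how a pipeline class ends up being instantiated).
as_pipeline = build_without_arguments(TabExporter)
print(type(as_pipeline).__name__)

# With a file object, which is what a feed exporter receives, it works.
as_feed_exporter = TabExporter(io.StringIO())
print(as_feed_exporter.options)
```

So the subclass itself may be fine; the problem would be where it is registered.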
Just drop the pipeline and try again. By the way, -t csv is optional in the latest Scrapy versions: the target format is inferred from the filename extension.
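If you also want to control which columns appear and in what order, newer Scrapy releases have a FEED_EXPORT_FIELDS setting for that, so no custom exporter is needed just for the headers. A minimal settings.py sketch, assuming your Scrapy version supports that setting and that your item defines the four fields listed in your question:

```python
# settings.py (sketch; assumes a Scrapy version that supports FEED_EXPORT_FIELDS)
BOT_NAME = 'allenheath'

SPIDER_MODULES = ['allenheath.spiders']
NEWSPIDER_MODULE = 'allenheath.spiders'

# Columns to write, in this order. The names are taken from the question
# and must match the fields declared on the item.
FEED_EXPORT_FIELDS = ['model', 'shortdesc', 'desc', 'series']
```

Note there is no ITEM_PIPELINES entry for the exporter here. With this in place, scrapy crawl products -o myinfo.csv should be enough. If you still want the tab delimiter, you would probably keep a custom exporter, but register it only under FEED_EXPORTERS and not as a pipeline.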