python - Scraping the mp3 file from flash player after the conversion

There is a textarea and a synthesize button on the page. It looks as follows:

    <textarea id="ttstext" name="text" style="font-size: 130%; width: 100%; height: 120px; padding: 5px;"></textarea>
    ...
    <div id="audioplayer">
        <script>
            create_playback();
        </script><audio autoplay="" autobuffer="" controls=""></audio>
    </div>
    <input id="commitbtn" value="synthesize" type="submit">

When I click the synthesize button, the HTML of the page changes as follows (it creates an audio player):

    <div id="audioplayer" style="display: block;">
        <embed width="370" height="20"
               flashvars="height=20&amp;width=370&amp;type=mp3&amp;file=http://services.abc.xyz.mp3&amp;showstop=true&amp;usefullscreen=false&amp;autostart=true"
               allowfullscreen="true" allowscriptaccess="always" quality="high"
               name="mpl" id="mpl" style="undefined"
               src="/demo/mediaplayer.swf" type="application/x-shockwave-flash">
    </div>

I want to get the generated mp3 file from Python code.

What I have tried so far:

    #!/usr/bin/env python
    # encoding: utf-8
    from __future__ import unicode_literals
    from contextlib import closing
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver import Firefox
    from selenium.webdriver.support.ui import WebDriverWait
    import BeautifulSoup  # BeautifulSoup 3 module
    import time

    url = "http://www..."

    def texttospeech():
        # closing() makes sure the browser window is closed afterwards
        with closing(Firefox()) as browser:
            try:
                browser.get(url)
            except TimeoutException:
                print("timeout")
            browser.find_element_by_id("ttstext").send_keys("hello.")
            button = browser.find_element_by_id("commitbtn")
            button.click()
            time.sleep(10)
            # The div exists from the start, so this wait returns immediately
            WebDriverWait(browser, timeout=100).until(
                lambda x: x.find_element_by_id('audioplayer'))
            src = browser.page_source
            return src

    def getaudio(source):
        soup = BeautifulSoup.BeautifulSoup(source)
        audio = soup.find("div", {"id": "audioplayer"})
        return audio.string

    if __name__ == "__main__":
        print(getaudio(texttospeech()))

The key to success is the URL of the resulting mp3 file. I don't know how to wait until the script changes the HTML (the inner content of <div id="audioplayer">). My code returns None, because it takes the result too soon.

In the case of such changes, it is not enough to wait for the element:

    WebDriverWait(browser, timeout=100).until(
        lambda x: x.find_element_by_id('audioplayer'))

Instead, you need to wait for a change condition, using an expected condition. Something to start from (not tested):

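    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # "mydynamicelement" is a placeholder id; use the id of the element that changes
    wait_text = 'file=http://'
    element = WebDriverWait(driver, 10).until(
        EC.text_to_be_present_in_element((By.ID, "mydynamicelement"), wait_text)
    )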
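One caveat: text_to_be_present_in_element compares against the element's visible text, while in the HTML shown in the question the string file=http:// lives in the flashvars attribute of the <embed> tag, so that condition may never match here. WebDriverWait.until() accepts any callable that takes the driver, so a custom condition can poll the attribute instead. The following is only a sketch under that assumption: the names mp3_url_from_flashvars and embed_with_file are made up for illustration, and the selector "#audioplayer embed" is taken from the question's markup.

```python
from urllib.parse import parse_qs


def mp3_url_from_flashvars(flashvars):
    """Return the `file` parameter of a flashvars string, or None if absent."""
    files = parse_qs(flashvars).get("file")
    return files[0] if files else None


def embed_with_file(driver):
    """Custom wait condition (hypothetical helper).

    Returns the <embed> inside #audioplayer once its flashvars attribute
    carries a `file` parameter; returns False so the wait keeps polling
    while no such element exists yet.
    """
    # "css selector" is the value of selenium's By.CSS_SELECTOR constant.
    for embed in driver.find_elements("css selector", "#audioplayer embed"):
        flashvars = embed.get_attribute("flashvars") or ""
        if mp3_url_from_flashvars(flashvars):
            return embed
    return False
```

With a live browser this would be used roughly as embed = WebDriverWait(browser, 30).until(embed_with_file), followed by mp3_url_from_flashvars(embed.get_attribute("flashvars")). Note that parse_qs splits on every &, so a file URL that itself contains an unescaped query string would come back truncated.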

You can check out the expected conditions here: http://selenium-python.readthedocs.org/en/latest/api.html?highlight=text_to_be_present_in_element#module-selenium.webdriver.support.expected_conditions
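Also keep in mind that even when the page source is captured late enough, audio.string in getaudio() would most likely still be None: the <embed> tag has no text content, and the mp3 URL is the file parameter inside its flashvars attribute. Here is a minimal sketch of pulling it out of the page source, assuming Python 3 and only the standard library (the class name EmbedFlashvarsParser and the function get_audio_url are hypothetical, and for simplicity it looks at every <embed> in the document rather than only the one inside the audioplayer div):

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs


class EmbedFlashvarsParser(HTMLParser):
    """Collect the flashvars attribute of every <embed> tag in a document."""

    def __init__(self):
        super().__init__()
        self.flashvars = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names and unescapes
        # entities such as &amp; inside attribute values.
        if tag == "embed":
            value = dict(attrs).get("flashvars")
            if value:
                self.flashvars.append(value)


def get_audio_url(source):
    """Return the first `file` parameter found in any flashvars, or None."""
    parser = EmbedFlashvarsParser()
    parser.feed(source)
    for flashvars in parser.flashvars:
        files = parse_qs(flashvars).get("file")
        if files:
            return files[0]
    return None
```

Called as get_audio_url(browser.page_source) after the player has been swapped in, this should give back the URL that can then be downloaded with any HTTP client.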
