Hi everyone, I have this error when using SqlSer...
# ask-community
Salohy:
Hi everyone, I have this error when using SqlServerExecute: `ImportError: Using prefect.tasks.sql_server requires Prefect to be installed with the "sql_server" extra`. I am using Docker for storage with a custom Dockerfile, as explained here. My code looks like this:
```python
import prefect
from prefect import task, Flow
from prefect.storage import Docker
from prefect.tasks.sql_server import SqlServerExecute

STORAGE = Docker(registry_url="xxx.azurecr.io",
                 image_name="prefect/test",
                 dockerfile="Dockerfile",
                 python_dependencies=["pyodbc"])

SQL = SqlServerExecute(db_name="myDb",
                       user="user",
                       host="tcp:xxx.database.windows.net",
                       port=1433,
                       driver="ODBC Driver 17 for SQL Server")

QUERY = """
  CREATE TABLE [dbo].[myTable] (
    [id] VARCHAR(22) COLLATE Latin1_General_100_BIN2 NOT NULL,
    [FirstName] VARCHAR(222) NULL,
    [LastName] VARCHAR(222) NULL
  )
"""

@task
def create_table():
    logger = prefect.context.get("logger")
    logger.info("Create table in DW for loading the results")
    SQL.run(query=QUERY, password="myPass")

# RUN_CONFIG and EXECUTOR are defined elsewhere in the project
with Flow("candidates-flow", storage=STORAGE, run_config=RUN_CONFIG, executor=EXECUTOR) as flow:
    create_table()
```
Can someone help? I do not understand the error. Many thanks already 🙏
I forgot to mention that the direct cause of the error is `ModuleNotFoundError: No module named 'pyodbc'`, which I do not really understand, since I am already adding pyodbc in python_dependencies and the Dockerfile is already customised to use it. It looks like pyodbc is not getting installed 😕
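For reference, a minimal sketch of how the "sql_server" extra could be pulled into the image alongside pyodbc. The registry URL and image name are the placeholders from the snippet above, and `prefect[sql_server]` as a pip requirement is an assumption based on the error message:
```python
from prefect.storage import Docker

# Sketch only: python_dependencies are pip-installed into the image, so the
# "sql_server" extra can be requested here. Note that pyodbc also needs the
# system ODBC driver (e.g. unixodbc + msodbcsql17), which must come from the
# custom Dockerfile, not from pip.
STORAGE = Docker(registry_url="xxx.azurecr.io",
                 image_name="prefect/test",
                 dockerfile="Dockerfile",
                 python_dependencies=["pyodbc", "prefect[sql_server]"])
```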
Kevin Kho:
Hi @Salohy, I am not seeing the RUN_CONFIG here. Could you show me what your RUN_CONFIG looks like?
Salohy:
My RUN_CONFIG is Kubernetes: `RUN_CONFIG = KubernetesRun()`
Hi @Kevin Kho 🙂
And the executor is Dask: `EXECUTOR = DaskExecutor(address="tcp://my-ip:8786")` …
Kevin Kho:
I think the issue here might be that KubernetesRun is not pulling the image that you defined, which is why it doesn't have `pyodbc`. Can you try passing it in?
Salohy:
I have the same error after passing the image to KubernetesRun
Kevin Kho:
How did you pass it?
Salohy:
`RUN_CONFIG = KubernetesRun(image="image:latest")`, right?
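A sketch of what the full image reference could look like, assuming the registry and image name from the Docker storage above (the tag is an assumption):
```python
from prefect.run_configs import KubernetesRun

# Sketch: the image passed to KubernetesRun should match what the Docker
# storage pushed, i.e. registry + image name + tag.
RUN_CONFIG = KubernetesRun(image="xxx.azurecr.io/prefect/test:latest")
```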
Kevin Kho:
I think what's happening is that the image is not updating: the Kubernetes image pull policy is `IfNotPresent` by default, so it doesn't re-pull the image because it sees one locally with that name. You want to change this to `Always`. This PR may help you: https://github.com/PrefectHQ/prefect/pull/4445/files
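Assuming a Prefect version that includes the linked PR, the pull policy can be set directly on the run config. A sketch with the placeholder image from above:
```python
from prefect.run_configs import KubernetesRun

# Sketch: force Kubernetes to re-pull the image on every flow run instead
# of reusing a locally cached image with the same tag.
RUN_CONFIG = KubernetesRun(image="xxx.azurecr.io/prefect/test:latest",
                           image_pull_policy="Always")
```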
Salohy:
Unfortunately, I still have the same error. I even changed the name of the image, but I still get `ModuleNotFoundError: No module named 'pyodbc'`.
Kevin Kho:
Can you check the logs of the agent and see if it’s pulling the right image?
Salohy:
These are the logs of the agent:
```
[2021-05-11 15:15:06,631] INFO - agent | Found 1 flow run(s) to submit for execution.
INFO:agent:Found 1 flow run(s) to submit for execution.
INFO:agent:Deploying flow run 79c0af83-ffd1-4b09-ae26-e34f8a0ebd88
[2021-05-11 15:15:06,819] INFO - agent | Deploying flow run 79c0af83-ffd1-4b09-ae26-e34f8a0ebd88
```
And when I describe the pod of the job, I get:
```
Type    Reason     Age  From               Message
----    ------     ---  ----               -------
Normal  Scheduled  18s  default-scheduler  Successfully assigned default/prefect-job-15fad83d-54hg7 to aks-agentpool-47953602-vmss000000
Normal  Pulling    17s  kubelet            Pulling image "xxx.azurecr.io/prefect/kn-candidate:latest"
Normal  Pulled     16s  kubelet            Successfully pulled image "dexter.azurecr.io/prefect/kn-candidate:latest" in 1.002010051s
Normal  Created    16s  kubelet            Created container flow
Normal  Started    16s  kubelet            Started container flow
```
Kevin Kho:
And then the flow run gives the import error?
Salohy:
Yes, then the job is passed to the Dask scheduler and the workers output:
```
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/distributed/worker.py", line 2632, in _maybe_deserialize_task
    function, args, kwargs = _deserialize(*ts.runspec)
  File "/opt/conda/lib/python3.8/site-packages/distributed/worker.py", line 3488, in _deserialize
    kwargs = pickle.loads(kwargs)
  File "/opt/conda/lib/python3.8/site-packages/distributed/protocol/pickle.py", line 75, in loads
    return pickle.loads(x)
  File "/opt/conda/lib/python3.8/site-packages/cloudpickle/cloudpickle.py", line 562, in subimport
    __import__(name)
ModuleNotFoundError: No module named 'pyodbc'
```
The logs in Prefect Cloud have the same output.
Kevin Kho:
I see. I think the workers are not getting pyodbc. Can you try a LocalDaskExecutor/LocalExecutor first? If that works, then we know the problem is installing pyodbc on the Dask workers.
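A sketch of the swap: LocalDaskExecutor runs tasks inside the flow-run container itself, which was built from the Docker storage and therefore has pyodbc installed:
```python
from prefect.executors import LocalDaskExecutor

# Sketch: run tasks in the flow-run container, bypassing the external
# Dask cluster, to check whether the image itself is fine.
EXECUTOR = LocalDaskExecutor()
```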
Salohy:
Does that mean we need to install all extra packages on the Dask workers?
Kevin Kho:
In general, yes: client packages are not automatically installed on workers, and sometimes version mismatches even break things. A lot of people spin up their own Dask cluster and provide an image while doing it, to load all dependencies.
In your case, it seems you already have a cluster up and running, so you need to get the packages installed on the workers; one option is sketched below.
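One way to get a Python package onto already-running workers without rebuilding their image is distributed's PipInstall worker plugin. A sketch, assuming a distributed version that ships the plugin; note that pip only covers the Python side of pyodbc, so the system ODBC driver still has to exist in the worker image:
```python
from dask.distributed import Client
from distributed.diagnostics.plugin import PipInstall

# Sketch: pip-install pyodbc on every worker in the running cluster.
# The scheduler address is the placeholder from earlier in the thread.
client = Client("tcp://my-ip:8786")
client.register_worker_plugin(PipInstall(packages=["pyodbc"]))
```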
Salohy:
Ah okay, will try that 🙂
Hey @Kevin Kho, many thanks 🤗 It works after installing pyodbc on the Dask workers 🙏😃
Kevin Kho:
Nice! 👍 Happy to help